PREFACE


This book and its companion volume Basic Real Analysis systematically develop 
concepts and tools in real analysis that are vital to every mathematician, whether 
pure or applied, aspiring or established. The two books together contain what the 
young mathematician needs to know about real analysis in order to communicate 
well with colleagues in all branches of mathematics. 

The books are written as textbooks, and their primary audience is students 
who are learning the material for the first time and who are planning a career in 
which they will use advanced mathematics professionally. Much of the material 
in the books corresponds to normal course work. Nevertheless, it is often the 
case that core mathematics curricula, time-limited as they are, do not include all 
the topics that one might like. Thus the book includes important topics that are 
sometimes skipped in required courses but that the professional mathematician 
will ultimately want to learn by self-study. 

The content of the required courses at each university reflects expectations of 
what students need before beginning specialized study and work on a thesis. These
expectations vary from country to country and from university to university. Even 
so, there seems to be a rough consensus about what mathematics a plenary lecturer 
at a broad international or national meeting may take as known by the audience. 
The tables of contents of the two books represent my own understanding of what 
that degree of knowledge is for real analysis today. 


Key topics and features of Advanced Real Analysis are that it: 


• Develops Fourier analysis and functional analysis with an eye toward partial
differential equations.

• Includes chapters on Sturm-Liouville theory, compact self-adjoint operators,
Euclidean Fourier analysis, topological vector spaces and distributions, compact
and locally compact groups, and aspects of partial differential equations.

• Contains chapters about analysis on manifolds and foundations of probability.

• Proceeds from the particular to the general, often introducing examples well
before a theory that incorporates them.

• Includes many examples and almost 200 problems, and a separate section
“Hints for Solutions of Problems” at the end of the book gives hints or complete
solutions for most of the problems.

• Incorporates, both in the text and in the problems but particularly in the
problems, material in which real analysis is used in algebra, in topology,
in complex analysis, in probability, in differential geometry, and in applied
mathematics of various kinds.


It is assumed that the reader has had courses in real variables and either is 
taking or has completed the kind of course in Lebesgue integration that might use 
Basic Real Analysis as a text. Knowledge of the content of most of Chapters I-VI 
and X of Basic Real Analysis is assumed throughout, and the need for further 
chapters of that book for particular topics is indicated in the chart on page xiv. 
When it is necessary in the text to quote a result from this material that might 
not be widely known, a specific reference to Basic Real Analysis is given; such 
references abbreviate the book title as Basic. 

Some understanding of complex analysis is assumed for Sections 3-4 and 6 of 
Chapter III, for Sections 10-11 of Chapter IV, for Section 4 of Chapter V, for all 
of Chapters VII and VIII, and for certain groups of problems, but not otherwise.
Familiarity with linear algebra and group theory at least at the undergraduate level 
is helpful throughout. 


The topics in the first eight chapters of this volume are related to one another 
in many ways, and the book needed some definite organizational principle for its 
design. The result was a decision to organize topics largely according to their role 
in the study of differential equations, even if differential equations do not explicitly 
appear in each of the chapters. Much of the material has other uses as well, but 
an organization of topics with differential equations in mind provides a common 
focus for the mathematics that is presented. Thus, for example, Fourier analysis 
and functional analysis are subjects that stand on their own and also that draw 
on each other, but the writing of the chapters on these areas deliberately points 
toward the subject of differential equations, and toward tools like distributions 
that are used with differential equations. These matters all come together in two 
chapters on differential equations, Chapters VII and VIII, near the end of the
book.

Portions of the first eight chapters can be used as the text for a course in any 
of three ways. One way is as an introduction to differential equations within a 
course on Lebesgue integration that treats integration and the Fourier transform 
relatively lightly; the expectation in this case is that parts of at most two or three 
chapters of this book would be used. A second way is as a text for a self-contained 
topics course in differential equations; the book offers a great deal of flexibility 
for the content of such a course, and no single choice is right for everyone. A 
third way is simply as a text for a survey of some areas of advanced real analysis; 
again the book offers great flexibility in how such a course is constructed. 

The problems at the ends of chapters are an important part of the book. Some
of them are really theorems, some are examples showing the degree to which
hypotheses can be stretched, and a few are just exercises. The reader gets no 
indication which problems are of which type, nor of which ones are relatively 
easy. Each problem can be solved with tools developed up to that point in the 
book, plus any additional prerequisites that are noted. 

This book seeks in part to help the reader look for and appreciate the unity of 
mathematics. For that reason some of the problems and sections go way outside 
the usual view of real analysis. One of the lessons about advanced mathematics 
is that progress is better measured by how mathematics brings together different 
threads, rather than how many new threads it generates. 


Almost all of the mathematics in this book and Basic Real Analysis is at least 
forty years old, and I make no claim that any result is new. The two books are 
together a distillation of lecture notes from a 35-year period of my own learning 
and teaching. Sometimes a problem at the end of a chapter or an approach to the 
exposition may not be a standard one, but normally no attempt has been made to 
identify such problems and approaches. 

I am grateful to Ann Kostant and Steven Krantz for encouraging this project and
for making many suggestions about pursuing it, and to Susan Knapp and David
Kramer for helping with the readability. The typesetting was by AMS-TeX, and
the figures were drawn with Mathematica.

I invite corrections and other comments from readers. I plan to maintain a list 
of known corrections on my own Web page. 

A. W. KNAPP 
June 2005 


DEPENDENCE AMONG CHAPTERS 


The chart below indicates the main lines of logical dependence of sections of 
Advanced Real Analysis on earlier sections and on chapters in Basic Real Analysis. 
Starting points are the boxes with double ruling. All starting points take Chapters
I-VI and X of Basic Real Analysis as known.


[Figure: chart of logical dependence of the sections of Advanced Real Analysis on earlier sections and on chapters of Basic Real Analysis.]


GUIDE FOR THE READER 


This section is intended to help the reader find out what parts of each chapter are 
most important and how the chapters are interrelated. Further information of this 
kind is contained in the chart on page xiv and in the abstracts that begin each of 
the chapters. 

Advanced Real Analysis deals with topics in real analysis that the young 
mathematician needs to know in order to communicate well with colleagues 
in all branches of mathematics. These topics include parts of Fourier analysis, 
functional analysis, spectral theory, distribution theory, abstract harmonic analy- 
sis, and partial differential equations. They tend to be ones whose applications 
and ramifications cut across several branches in mathematics. Each topic can 
be studied on its own, but the importance of the topic arises from its influence 
on the other topics and on other branches of mathematics. To avoid having all 
these relationships come across as a hopeless tangle, the book needed some 
organizational principle for its design. The principle chosen was largely to 
organize topics according to their role in the study of differential equations. This 
organizational principle influences what appears below, but it is certainly not 
intended to suggest that applications to differential equations are the only reason 
for studying certain topics in real analysis. 

As was true also in Basic Real Analysis, several techniques that are used 
repeatedly in real analysis play a pivotal role. Examples are devices for justifying 
interchanges of limits, compactness and completeness as tools for proving exis- 
tence theorems, and the approach of handling nice functions first and then passing 
to general functions. By the beginning of the present volume, these techniques 
have become sophisticated enough so as to account for entire areas of study within 
real analysis. The theory of weak derivatives illustrates this principle: The theory 
allows certain interchanges of limits involving weak derivatives to be carried out 
routinely, and the hard work occurs in translating the results into statements about 
classical derivatives. The main tool for this translation is Sobolev’s Theorem, 
which in turn becomes the foundation for its own theory. 


Each chapter is built around one or more important theorems. The commentary 
below tells the nature of each chapter and the role of some important theorems. 

Chapter I marks two transitions—from concrete mathematics done by cal- 
culation to theorems established by functional analysis on the one hand, and 
from ordinary differential equations to partial differential equations on the other
hand. Section 2 about separation of variables is relatively elementary, introducing
and illustrating a first technique for approaching partial differential equations. 
The technique involves a step of making calculations and a step of providing 
justification that the method is fully applicable. When the technique succeeds, 
the partial differential equation is reduced to two or more ordinary differential 
equations. Section 3 establishes, apart from one detail, the main theorem of 
the chapter, called Sturm’s Theorem. Sturm’s Theorem addresses the nature of 
solutions of certain kinds of ordinary differential equations with a parameter. 
This result can sometimes give a positive answer to the completeness questions 
needed to justify separation of variables, and it hints at a theory known as
Sturm-Liouville theory that contains more results of this kind. The one detail with
Sturm’s Theorem that is postponed from Section 3 to Chapter II is the
Hilbert-Schmidt Theorem.


Chapter II is a first chapter on functional analysis beyond Chapter XII of Basic 
Real Analysis, with emphasis on a simple case of the Spectral Theorem. The 
result in question describes the structure of compact self-adjoint operators on a 
Hilbert space. The Hilbert-Schmidt Theorem says that certain integral operators 
are of this kind, and it completes the proof of Sturm’s Theorem as presented in 
Chapter I; however, Chapter I is not needed for an understanding of Chapter II. 
Section 4 of Chapter II gives several equivalent definitions of unitary operators 
and is relevant for many later chapters of the book. Section 5 discusses compact, 
Hilbert-Schmidt, and trace-class operators abstractly and may be skipped on first
reading. 

Chapter III is a first chapter on Fourier analysis beyond Chapters VIII and IX of 
Basic Real Analysis, and it discusses four topics that are somewhat independent of 
one another. The first of these, in Sections 1-2, introduces aspects of distribution
theory and the idea of weak derivatives. The main result is Sobolev’s Theorem, 
which tells how to extract conclusions about ordinary derivatives from conclusions 
about weak derivatives. Readers with a particular interest in this topic will want 
to study also Problems 8-12 and 25-34 at the end of the chapter. Sections 3-4 
concern harmonic functions, which are functions annihilated by the Laplacian, 
and associated Poisson integrals, which relate harmonic functions to the subject of 
boundary-value problems. These sections may be viewed as providing an example 
of what to expect of the more general “elliptic” differential operators to be studied 
in Chapters VII-VIII. The main results are a mean value property for harmonic 
functions, a maximum principle, a reflection principle, and a characterization 
of harmonic functions in a half space that arise as Poisson integrals. Sections 
5-6 establish the Calderón-Zygmund Theorem and give two applications to
partial differential equations. The theorem generalizes the boundedness of the
Hilbert transform, which was proved in Chapters VIII-IX of Basic Real Analysis.
Historically the Calderón-Zygmund Theorem was a precursor to the theory of
pseudodifferential operators that is introduced in Chapter VII. Sections 7-8 gently
introduce multiple Fourier series, which are used as a tool several times in later 
chapters. 


Chapter IV weaves together three lines of investigation in the area of func- 
tional analysis—one going toward spaces of smooth functions and distribution 
theory, another leading to fixed-point theorems, and a third leading to full-fledged 
spectral theory. The parts of the chapter relevant for spaces of smooth functions 
and distribution theory are Sections 1-2 and 5-7. This line of investigation 
continues in Chapters V and VII-VIII. The parts of the chapter relevant for
fixed-point theorems are Sections 1, 3-6, and 8-9. Results of this kind, which have
applications to equilibrium problems in economics and mathematical physics, are 
not pursued beyond Chapter IV in this book. The parts of the chapter relevant 
to spectral theory are Sections 1, 3-4, and 10-11, and spectral theory is not 
pursued beyond Chapter IV. Because the sections of the chapter have overlapping 
purposes, some of the main results play multiple roles. Among the main results 
are the characterization of finite-dimensional topological vector spaces as being 
Euclidean, the existence of “support” for distributions, Alaoglu’s Theorem assert- 
ing weak-star compactness of the closed unit ball of the dual of a Banach space, 
the Stone Representation Theorem as a model for the theory of commutative C* 
algebras, a separation theorem concerning continuous linear functionals in locally 
convex topological vector spaces, the construction of inductive limit topologies, 
the Krein-Milman Theorem concerning the existence of extreme points, the
structure theorem for commutative C* algebras, and the Spectral Theorem for
commuting families of bounded normal operators. Spectral theory has direct
applications to differential equations beyond what appears in Chapters I-II, but
the book does not go into these applications. 


Chapter V develops the theory of distributions, and of operations on them, 
without going into their connection with Sobolev spaces. The chapter includes a 
lengthy discussion of convolution. The main results are a structure theorem for 
distributions of compact support in terms of derivatives of measures, a theorem 
saying that the Fourier transforms of such distributions are smooth functions, and 
a theorem saying that the convolution of a distribution of compact support and 
a tempered distribution is meaningful and tempered, with its Fourier transform 
being the product of the Fourier transforms. 


Chapter VI introduces harmonic analysis using groups. Section 1 concerns 
general topological groups, Sections 2-5 are about invariant measures on locally
compact groups and their quotients, and Sections 6-7 concern the representation
theory of compact groups. Section 8 indicates how representation theory simplifies
problems concerning linear operators with a sizable group of symmetries.
One main result of the chapter is the existence and uniqueness of Haar measure,
up to a scalar factor, on any locally compact group. Another is the Peter-Weyl
Theorem, which is a completeness theorem for Fourier analysis on a general
compact group akin to Parseval’s Theorem for Fourier series and the circle group. 
The proof of the Peter-Weyl Theorem uses the Hilbert-Schmidt Theorem. 

Chapter VII is a first systematic discussion of partial differential equations, 
mostly linear, using tools from earlier chapters. Section 1 seeks to quantify 
the additional data needed for a differential equation or system simultaneously to 
have existence and uniqueness of solutions. The Cauchy-Kovalevskaya Theorem,
which assumes that everything is holomorphic, is stated in general and gives a 
local result; for special kinds of systems it gives a global result whose proof 
is carried out in problems at the end of the chapter. Section 2 mentions some 
other properties and examples of differential equations, including the possibility 
of nonexistence of local solutions for linear equations Lu = f when f is not 
holomorphic. Section 3 contains a general theorem asserting local existence of 
solutions for linear equations Lu = f when L has constant coefficients; the proof 
uses multiple Fourier series. Section 5 concerns elliptic operators L with constant 
coefficients; these generalize the Laplacian. A complete proof is given in this case 
for the existence of a “parametrix” for L, which leads to control of regularity of 
solutions, and for the existence of “fundamental solutions.” Section 6 introduces, 
largely without proofs, a general theory of pseudodifferential operators. To focus 
attention on certain theorems, the section describes how the theory can be used 
to obtain parametrices for elliptic operators with variable coefficients. 

Chapter VIII in Sections 1-4 introduces smooth manifolds and vector bundles 
over them, particularly the tangent and cotangent bundles. Readers who are 
already familiar with this material may want to skip these sections. Sections 
5-8 use this material to extend the theory of differential and pseudodifferential 
operators to the setting of smooth manifolds, where such operators arise naturally 
in many applications. Section 7 in particular describes how to adapt the theory 
of Chapter VII to obtain parametrices for elliptic operators on smooth manifolds. 

Chapter IX is a stand-alone chapter on probability theory. Although partial 
differential equations interact with probability theory and have applications to 
differential geometry and financial mathematics, such interactions are too ad- 
vanced to be addressed in this book. Instead three matters are addressed that are 
foundational and yet at the level of this book: how measure theory is used to model 
real-world probabilistic situations, how the Kolmogorov Extension Theorem con- 
structs measure spaces that underlie stochastic processes, and how probabilistic 
independence and a certain indifference to the nature of the underlying measure 
space lead to a proof of the Strong Law of Large Numbers. 


NOTATION AND TERMINOLOGY 


This section lists notation and a few unusual terms from elementary mathematics 
and from Basic Real Analysis that are taken as standard in the text without further 
definition. The items are grouped by topic. 


Set theory

$\in$    membership symbol
$\#S$ or $|S|$    number of elements in $S$
$\varnothing$    empty set
$\{x \in E \mid P\}$    the set of $x$ in $E$ such that $P$ holds
$E^c$    complement of the set $E$
$E \cup F$, $E \cap F$, $E - F$    union, intersection, difference of sets
$\bigcup_\alpha E_\alpha$, $\bigcap_\alpha E_\alpha$    union, intersection of the sets $E_\alpha$
$E \subseteq F$, $E \supseteq F$    $E$ is contained in $F$, $E$ contains $F$
$E \times F$, $\times_{s \in S} X_s$    products of sets
$(a_1, \ldots, a_n)$    ordered $n$-tuple
$\{a_1, \ldots, a_n\}$    unordered $n$-tuple
$f : E \to F$, $x \mapsto f(x)$    function, effect of function
$f \circ g$, $f|_E$    composition of $f$ following $g$, restriction to $E$
$f(\,\cdot\,, y)$    the function $x \mapsto f(x, y)$
$f(E)$, $f^{-1}(E)$    direct and inverse image of a set
countable    finite or in one-one correspondence with integers
$2^A$    set of all subsets of $A$
$B^A$    set of all functions from $B$ to $A$
card $A$    cardinality of $A$

Number systems

$\delta_{ij}$    Kronecker delta: 1 if $i = j$, 0 if $i \ne j$
$\binom{n}{k}$    binomial coefficient
$n$ positive, $n$ negative    $n > 0$, $n < 0$
$\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$    integers, rationals, reals, complex numbers
$\mathbb{F}$    $\mathbb{R}$ or $\mathbb{C}$, the underlying field of scalars
max    maximum of finite subset of a totally ordered set
min    minimum of finite subset of a totally ordered set
$\sum$ or $\prod$    sum or product, possibly with a limit operation
$[x]$    greatest integer $\le x$ if $x$ is real
$\operatorname{Re} z$, $\operatorname{Im} z$    real and imaginary parts of complex $z$
$\bar z$    complex conjugate of $z$
$|z|$    absolute value of $z$

Linear algebra and elementary group theory

$\mathbb{R}^n$, $\mathbb{C}^n$, $\mathbb{F}^n$    spaces of column vectors with $n$ entries
$x \cdot y$    dot product
$e_j$    $j$th standard basis vector of $\mathbb{R}^n$
$1$ or $I$    identity matrix or operator
$\det A$    determinant of $A$
$A^{\mathrm{tr}}$    transpose of $A$
$\operatorname{diag}(a_1, \ldots, a_n)$    diagonal square matrix
$\operatorname{Tr} A$    trace of $A$
$[M_{ij}]$    matrix with $(i, j)$th entry $M_{ij}$
$\dim V$    dimension of vector space
$0$    additive identity in an abelian group
$1$    multiplicative identity in a group or ring
$\cong$    is isomorphic to, is equivalent to


Real-variable theory and calculus

$\mathbb{R}^*$    extended reals, reals with $\pm\infty$ adjoined
sup and inf    supremum and infimum in $\mathbb{R}^*$
$(a, b)$, $[a, b]$    open interval in $\mathbb{R}^*$, closed interval
$(a, b]$, $[a, b)$    half-open intervals in $\mathbb{R}^*$
$\limsup_n$, $\liminf_n$    $\inf_n \sup_{k \ge n}$ in $\mathbb{R}^*$, $\sup_n \inf_{k \ge n}$ in $\mathbb{R}^*$
lim    limit in $\mathbb{R}$ or $\mathbb{R}^*$ or $\mathbb{R}^N$
$|x|$    $\bigl(\sum_{j=1}^N |x_j|^2\bigr)^{1/2}$ if $x = (x_1, \ldots, x_N)$, scalars in $\mathbb{R}$ or $\mathbb{C}$
$e$    $\sum_{n=0}^\infty 1/n!$
$\exp x$, $\sin x$, $\cos x$, $\tan x$    exponential and trigonometric functions
$\arcsin x$, $\arctan x$    inverse trigonometric functions
$\log x$    natural logarithm function on $(0, +\infty)$
$\partial f/\partial x_j$    partial derivative of $f$ with respect to $j$th variable
$C^k(V)$, $k \ge 0$    scalar-valued functions on open set $V \subseteq \mathbb{R}^N$
        with all partial derivatives continuous through
        order $k$, no assumption of boundedness
$C^\infty(V)$    $\bigcap_{k=0}^\infty C^k(V)$
$f : V \to \mathbb{F}$ is smooth    $f$ is scalar valued and is in $C^\infty(V)$
homogeneous of degree $d$    satisfying $f(rx) = r^d f(x)$ for all $x \ne 0$ in $\mathbb{R}^N$
        and all $r > 0$ if $f$ is a function $f : \mathbb{R}^N - \{0\} \to \mathbb{F}$

Metric spaces and topological spaces

$d$    typical name for a metric
$B(r; x)$    open ball of radius $r$ and center $x$
$A^{\mathrm{cl}}$    closure of $A$
$A^o$    interior of $A$
separable    having a countable base for its open sets
$D(x, A)$    distance to a set $A$ in a metric space
$x_n \to x$ or $\lim x_n = x$    limit relation for a sequence or a net
$S^{N-1}$    unit sphere in $\mathbb{R}^N$
support of function    closure of set where function is nonzero
$\|f\|_{\sup}$    $\sup_{x \in S} |f(x)|$ if $f : S \to \mathbb{F}$ is given
$B(S)$    space of all bounded scalar-valued functions on $S$
$B(S, \mathbb{C})$ or $B(S, \mathbb{R})$    space of members of $B(S)$ with values in $\mathbb{C}$ or $\mathbb{R}$
$C(S)$    space of all bounded scalar-valued continuous
        functions on $S$ if $S$ topological
$C(S, \mathbb{C})$ or $C(S, \mathbb{R})$    space of members of $C(S)$ with values in $\mathbb{C}$ or $\mathbb{R}$
$C_{\mathrm{com}}(S)$    space of functions in $C(S)$ with compact support
$C_0(S)$    space of functions in $C(S)$ vanishing
        at infinity if $S$ is locally compact Hausdorff
$X^*$    one-point compactification of $X$

Measure theory

$m(E)$ or $|E|$    Lebesgue measure of $E$
indicator function of set $E$    function equal to 1 on $E$, 0 off $E$
$I_E(x)$    indicator function of $E$ at $x$
$f^+$    $\max(f, 0)$ for $f$ with values in $\mathbb{R}^*$
$f^-$    $-\min(f, 0)$ for $f$ with values in $\mathbb{R}^*$
$\int_E f\,d\mu$ or $\int_E f(x)\,d\mu(x)$    Lebesgue integral of $f$ over $E$ with respect to $\mu$
$dx$    abbreviation for $d\mu(x)$ for $\mu$ = Lebesgue measure
$\int_a^b f\,dx$    Lebesgue integral of $f$ on interval $(a, b)$
        with respect to Lebesgue measure
$(X, \mathcal{A}, \mu)$ or $(X, \mu)$    typical measure space
a.e. $[d\mu]$    almost everywhere with respect to $\mu$
$\nu = f\,d\mu$    complex measure $\nu$ with $\nu(E) = \int_E f\,d\mu$
$\mathcal{A} \times \mathcal{B}$    product of $\sigma$-algebras
$\mu \times \nu$    product of $\sigma$-finite measures
$\|f\|_p$    $L^p$ norm, $1 \le p \le \infty$
$p'$    dual index to $p$ with $p' = p/(p - 1)$
$L^p(X, \mathcal{A}, \mu)$ or $L^p(X, \mu)$    space of functions with $\|f\|_p < \infty$ modulo
        functions equal to 0 a.e. $[d\mu]$
$f * g$    convolution
$f^*(x)$    Hardy-Littlewood maximal function, given by
        the supremum of the averages of $|f|$ over balls
        centered at $x$
$d\omega$    spherical part of Lebesgue measure on $\mathbb{R}^N$,
        measure on $S^{N-1}$ with $dx = r^{N-1}\,dr\,d\omega$
$\Omega_{N-1}$    “area” of $S^{N-1}$ given by $\Omega_{N-1} = \int_{S^{N-1}} d\omega$
$\Gamma(s)$    gamma function with $\Gamma(s) = \int_0^\infty t^{s-1} e^{-t}\,dt$
$\nu \ll \mu$    $\nu$ is absolutely continuous with respect to $\mu$
Borel set in locally compact Hausdorff space $X$    set in $\sigma$-algebra generated by compact sets in $X$
$\mathcal{B}(X)$    $\sigma$-algebra of Borel sets if $X$ is locally compact
        Hausdorff
compact $G_\delta$    compact set equal to countable intersection of
        open sets
Baire set in locally compact Hausdorff space $X$    set in $\sigma$-algebra generated by compact $G_\delta$’s in $X$
$M(X)$    space of all finite regular Borel complex
        measures on $X$ if $X$ is locally compact Hausdorff
$M(X, \mathbb{C})$ or $M(X, \mathbb{R})$    $M(X)$ with values in $\mathbb{F} = \mathbb{C}$ or $\mathbb{F} = \mathbb{R}$


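Fourier series and Fourier transform

$c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx$    Fourier coefficient
$f(x) \sim \sum_{n=-\infty}^{\infty} c_n e^{inx}$    Fourier series of $f$, with $c_n$ as above
$s_N(f; x) = \sum_{n=-N}^{N} c_n e^{inx}$    partial sum of Fourier series
$\hat f(y) = \int_{\mathbb{R}^N} f(x) e^{-2\pi i x \cdot y}\,dx$    Fourier transform of an $f$ in $L^1(\mathbb{R}^N)$
$f(x) = \int_{\mathbb{R}^N} \hat f(y) e^{2\pi i x \cdot y}\,dy$    Fourier inversion formula
$\mathcal{F}$    Fourier transform as an operator
$\|\mathcal{F} f\|_2 = \|f\|_2$    Plancherel formula
$\mathcal{S}$ or $\mathcal{S}(\mathbb{R}^N)$    Schwartz space on $\mathbb{R}^N$
$\frac{1}{\pi} \lim_{\varepsilon \downarrow 0} \int_{|t| \ge \varepsilon} \frac{f(x - t)}{t}\,dt$    Hilbert transform of function $f$ on $\mathbb{R}^1$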


Normed linear spaces and Banach spaces

$\|\cdot\|$    typical norm in a normed linear space
$(\,\cdot\,, \cdot\,)$    typical inner product in a Hilbert space,
        linear in first variable, conjugate linear in second
$M^\perp$    space of vectors orthogonal to all members of $M$
$X^*$    dual of normed linear space $X$
$\iota$    canonical mapping of $X$ into $X^{**} = (X^*)^*$
$\mathcal{B}(X, Y)$    space of bounded linear operators from $X$ into $Y$


Advanced Real Analysis 


CHAPTER I 


Introduction to Boundary-Value Problems 


Abstract. This chapter applies the theory of linear ordinary differential equations to certain 
boundary-value problems for partial differential equations. 

Section 1 briefly introduces some notation and defines the three partial differential equations of
principal interest—the heat equation, Laplace’s equation, and the wave equation. 

Section 2 is a first exposure to solving partial differential equations, working with boundary-value 
problems for the three equations introduced in Section 1. The settings are ones where the method of 
“separation of variables” is successful. In each case the equation reduces to an ordinary differential 
equation in each independent variable, and some analysis is needed to see when the method actually 
solves a particular boundary-value problem. In simple cases Fourier series can be used. In more 
complicated cases Sturm’s Theorem, which is stated but not proved in this section, can be helpful. 

Section 3 returns to Sturm’s Theorem, giving a proof contingent on the Hilbert-Schmidt Theorem, 
which itself is proved in Chapter II. The construction within this section finds a Green’s function for 
the second-order ordinary differential operator under study; the Green’s function defines an integral 
operator that is essentially an inverse to the second-order differential operator. 


1. Partial Differential Operators 


This chapter contains a first discussion of linear partial differential equations. The 
word “equation” almost always indicates that there is a single unknown function, 
and the word “partial” indicates that this function probably depends on more than 
one variable. In every case the equation will be homogeneous in the sense that it 
is an equality of terms, each of which is the product of the unknown function or 
one of its iterated partial derivatives to the first power, times a known coefficient 
function. Consequently the space of solutions on the domain set is a vector 
space, a fact that is sometimes called the superposition principle. The emphasis 
will be on a naive-sounding method of solution called “separation of variables” 
that works for some equations in some situations but not for all equations in all 
situations. This method, which will be described in Section 2, looks initially for 
solutions that are products of functions of one variable and hopes that all solutions 
can be constructed from these by taking linear combinations and passing to the 
limit. 
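
The superposition principle can be illustrated in one line. The sketch below assumes the equation is written as $Lu = 0$ for a linear differential operator $L$; the symbol $L$ and the solutions $u$, $v$ are names introduced here only for illustration.

```latex
% Sketch: linearity of L makes the set of solutions a vector space.
% Assumption: L is linear, and Lu = 0 and Lv = 0 on the same domain.
\[
L(au + bv) \;=\; a\,Lu + b\,Lv \;=\; a \cdot 0 + b \cdot 0 \;=\; 0
\qquad \text{for all scalars } a,\ b .
\]
```

The computation covers finite linear combinations only; passing to the limit in an infinite series of solutions is a separate step that needs its own justification.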
For the basic existence-uniqueness results with ordinary differential equations,
one studies single ordinary differential equations in the presence of initial data
of the form $y(t_0) = y_0, \ldots, y^{(n-1)}(t_0) = y_0^{(n-1)}$. Implicitly the independent
variable is regarded as time. For the partial differential equations in the settings
that we study in this section, the solutions are to be defined in a region of space
for all time $t > 0$, and the corresponding additional data give information to be
imposed on the solution function at the boundary of the resulting domain in
space-time. Behavior at $t = 0$ will not be sufficient to determine solutions uniquely;
we shall need further conditions that are to be satisfied for all $t > 0$ when the
space variables are at the edge of the region of definition. We refer to these two 
types of conditions as initial data and space-boundary data. Together they are 
simply boundary data or boundary values. 
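
To make the two kinds of data concrete, here is a sketch of how a boundary-value problem for a function $u(x, t)$ of one space variable might be posed. The interval $0 \le x \le l$, the initial profile $f$, and the choice of zero values at the two ends are assumptions made only for this illustration; the equation shown is the one-variable case of the heat equation.

```latex
% Illustrative boundary-value problem; l, f, and the zero end values are assumed.
\begin{align*}
\frac{\partial u}{\partial t} &= \frac{\partial^2 u}{\partial x^2}
    && \text{for } 0 < x < l,\ t > 0 && \text{(differential equation)} \\
u(x, 0) &= f(x)
    && \text{for } 0 \le x \le l && \text{(initial data)} \\
u(0, t) &= u(l, t) = 0
    && \text{for } t > 0 && \text{(space-boundary data)}
\end{align*}
```

The second line prescribes behavior at $t = 0$, and the third line prescribes behavior at the edge of the space region for all later times.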

For the most part the partial differential equations will be limited to three—the 
heat equation, the Laplace equation, and the wave equation. Each of these involves 
space variables in some $\mathbb{R}^n$, and the heat and wave equations involve also a time
variable $t$. To simplify the notation, we shall indicate partial differentiations by
subscripts; thus $u_{xt}$ is shorthand for $\partial^2 u / \partial x\,\partial t$. The space variables are usually
$x_1, \ldots, x_n$, but we often write $x, y, z$ for them if $n \le 3$. The linear differential
operator $\Delta$ given by

$$\Delta u = u_{x_1 x_1} + \cdots + u_{x_n x_n}$$

is involved in the definition of all three equations and is known as the Laplacian
in $n$ space variables.
The first partial differential equation that we consider is the heat equation,
which takes the form

$$u_t = \Delta u,$$

the unknown function $u(x_1, \ldots, x_n, t)$ being real-valued in any physically meaningful
situation. Heat flows by conduction, as a function of time, in the region
of the space variables, and this equation governs the temperature on any open 
set where there are no external influences. It is usually assumed that external 
influences come into play on the boundary of the space region, rather than the 
interior. They do so through a given set of space-boundary data. Since time and 
distance squared have distinct physical units, some particular choice of units has 
been incorporated into the equation in order to make a certain constant reduce 
to 1. 
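
The remark about units can be made explicit. The sketch below assumes that, in the original physical units, the equation carries a positive constant in front of the Laplacian; the names $\kappa$ for that constant, $s$ for the original time variable, and $v$ for the unnormalized temperature are introduced here and are not notation from the text.

```latex
% Assumption: v(x, s) satisfies the heat equation with a constant \kappa > 0.
\[
v_s = \kappa\, \Delta v .
\]
% Rescale time by t = \kappa s and define u(x, t) = v(x, t/\kappa).
% By the chain rule,
\[
u_t(x, t) \;=\; \frac{1}{\kappa}\, v_s(x, t/\kappa)
          \;=\; \Delta v(x, t/\kappa)
          \;=\; \Delta u(x, t),
\]
% so u satisfies the normalized equation u_t = \Delta u.
```

Choosing the unit of time in this way is what allows the constant to be taken equal to 1.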

The second partial differential equation that we consider is the Laplace
equation, which takes the form

$$\Delta u = 0,$$

the unknown function $u(x_1, \ldots, x_n)$ again being real-valued in any physically
meaningful situation. A $C^2$ function that satisfies the Laplace equation on an
open set is said to be harmonic. The potential due to an electrostatic charge is
harmonic on any open set where the charge is 0, and so are steady-state solutions
of the heat equation, i.e., those solutions with time derivative 0.
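
A quick numerical sanity check of the definition of a harmonic function can be sketched as follows. The helper name `laplacian_2d` and the two test functions are hypothetical; the helper approximates $u_{xx} + u_{yy}$ by central differences, which is an illustration only and not a method used in the text. The function $x^2 - y^2$ is harmonic on $\mathbb{R}^2$, while $x^2 + y^2$ has Laplacian 4 everywhere.

```python
def laplacian_2d(u, x, y, h=1e-3):
    """Approximate u_xx + u_yy at (x, y) with the five-point central-difference stencil."""
    total = u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4.0 * u(x, y)
    return total / (h * h)


def saddle(x, y):
    # x^2 - y^2: u_xx = 2 and u_yy = -2, so the Laplacian is 0 (harmonic).
    return x * x - y * y


def paraboloid(x, y):
    # x^2 + y^2: u_xx = 2 and u_yy = 2, so the Laplacian is 4 (not harmonic).
    return x * x + y * y


if __name__ == "__main__":
    print(laplacian_2d(saddle, 0.3, -0.7))      # close to 0
    print(laplacian_2d(paraboloid, 0.3, -0.7))  # close to 4
```

For quadratic polynomials the central-difference formula is exact apart from rounding, so the printed values differ from 0 and 4 only by floating-point error.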

The third and final partial differential equation that we consider is the wave 
equation, which takes the form 


\[ u_{tt} = \Delta u, \]


the unknown function $u(x_1,\dots,x_n,t)$ once again being real-valued in any physically
meaningful situation. Waves of light or sound spread in some medium in
space as a function of time. In our applications we consider only cases in which 
the number of space variables is 1 or 2, and the function u is interpreted as the 
displacement as a function of the space and time variables. 


2. Separation of Variables 


We shall describe the method of separation of variables largely through what 
happens in examples. As we shall see, the rigorous verification that separation of 
variables is successful in a particular example makes serious analytic demands 
that bring together a great deal of real-variable theory as discussed in Chapters 
I-IV of Basic.¹ The general method of separation of variables allows use of a
definite integral of multiples of the basic product solutions, but we shall limit 
ourselves to situations in which a sum or an infinite series of multiples of basic 
product solutions is sufficient. Roughly speaking, there are four steps: 


(i) Search for basic solutions that are the products of one-variable functions, 
and form sums or infinite series of multiples of them (or integrals in a 
more general setting). 


(ii) Use the boundary data to determine what specific multiples of the basic 
product solutions are to be used. 


(iii) Address completeness of the expansions as far as dealing with all sets of 
boundary data is concerned. 

(iv) Justify that the obtained solution has the required properties. 
Steps (i) and (ii) are just a matter of formal computation, but steps (iii) and (iv) 
often require serious analysis. In step (iii) the expression “all sets of boundary 
data” needs some explanation, as far as smoothness conditions are concerned. 
The normal assumption for the three partial differential equations of interest is 
that the data have two continuous derivatives, just as the solutions of the equations 
are to have. Often one can verify (iii) and carry out (iv) for somewhat rougher
data, but the verification of (iv) in this case may be regarded as an analysis problem
separate from solving the partial differential equation.


¹Throughout this book the word “Basic” indicates the companion volume Basic Real Analysis.

The condition that the basic product solutions in (i) form a discrete set, so that 
the hoped-for solutions are given by infinite series and not integrals, normally 
results from assuming that the space variables are restricted to a bounded set and 
that sufficiently many boundary conditions are specified. In really simple situa- 
tions the benefit that we obtain is that an analytic problem potentially involving 
Fourier integrals is replaced by a more elementary analytic problem with Fourier 
series; in more complicated situations we obtain a comparable benefit. Step (iii) 
is crucial since it partially addresses the question whether the solution we seek is 
at all related to basic product solutions. Let us come back to what step (iii) entails 
in a moment. Step (iv) is a matter of interchanges of limits. One step consists 
in showing that the expected solution satisfies the partial differential equation, 
and this amounts to interchanging infinite sums with derivatives. It often comes 
down to the standard theorem in real-variable theory for that kind of interchange, 
which is proved in the real-valued case as Theorem 1.23 of Basic and extended 
to the vector-valued case later. We restate it here in the vector-valued case for 
handy reference. 


Theorem 1.1. Suppose that $\{f_n\}$ is a sequence of functions on an interval with
values in a finite-dimensional real or complex vector space $V$. Suppose further
that the functions are continuous for $a \le t \le b$ and differentiable for $a < t < b$,
that $\{f_n'\}$ converges uniformly for $a < t < b$, and that $\{f_n(x_0)\}$ converges in $V$
for some $x_0$ with $a \le x_0 \le b$. Then $\{f_n\}$ converges uniformly for $a \le t \le b$ to
a function $f$, and $f'(x) = \lim_n f_n'(x)$ for $a < x < b$, with the derivative and the
limit existing.


Another step in handling (iv) consists in showing that the expected solution has 
the asserted boundary values. This amounts to interchanging infinite sums with 
passages to the limit as certain variables tend to the boundary, and the following 
result can often handle that. 


Proposition 1.2. Let $X$ be a set, let $Y$ be a metric space, let $A_n(x)$ be a
sequence of complex-valued functions on $X$ such that $\sum_{n=1}^\infty |A_n(x)|$ converges
uniformly, and let $B_n(y)$ be a sequence of complex-valued functions on $Y$ such
that $|B_n(y)| \le 1$ for all $n$ and $y$ and such that $\lim_{y\to y_0} B_n(y) = B_n(y_0)$ for all $n$.
Then

\[ \lim_{y\to y_0} \sum_{n=1}^\infty A_n(x)B_n(y) = \sum_{n=1}^\infty A_n(x)B_n(y_0), \]

and the convergence is uniform in $x$ if, in addition to the above hypotheses, each
$A_n(x)$ is bounded.

PROOF. Let $\epsilon > 0$ be given, and choose $N$ large enough so that $\sum_{n=N+1}^\infty |A_n(x)|$
is $< \epsilon$. Then

\[ \Big| \sum_{n=1}^\infty A_n(x)B_n(y) - \sum_{n=1}^\infty A_n(x)B_n(y_0) \Big| = \Big| \sum_{n=1}^\infty A_n(x)\big(B_n(y) - B_n(y_0)\big) \Big| \]

\[ \le \sum_{n=1}^N |A_n(x)|\,|B_n(y) - B_n(y_0)| + 2\sum_{n=N+1}^\infty |A_n(x)| \]

\[ < 2\epsilon + \sum_{n=1}^N |A_n(x)|\,|B_n(y) - B_n(y_0)|. \]

For $y$ close enough to $y_0$, the second term on the right side is $< \epsilon$, and the pointwise
limit relation is proved. The above argument shows that the convergence is
uniform in $x$ if $\max_{1\le n\le N} |A_n(x)| \le M$ independently of $x$.


In combination with a problem² in Basic, Proposition 1.2 shows, under the
hypotheses as stated, that if $X$ is a metric space and if $\sum_{n=1}^\infty A_n(x)B_n(y)$ is
continuous on $X \times (Y - \{y_0\})$, then it is continuous on $X \times Y$. This conclusion
can be regarded, for our purposes, as tying the solution of the partial differential 
equation well enough to one of its boundary conditions. It is in this sense that 
Proposition 1.2 contributes to handling part of step (iv). 

Let us return to step (iii). Sometimes this step is handled by the completeness 
of Fourier series as expressed through a uniqueness theorem³ or Parseval's Theorem.⁴
But these methods work in only a few examples. The tools necessary to deal
completely with step (iii) in all discrete cases generate a sizable area of analysis
known in part as “Sturm–Liouville theory,” of which Fourier series is only the
beginning. We do not propose developing all these tools, but we shall give in 
Theorem 1.3 one such tool that goes beyond ordinary Fourier series, deferring 
any discussion of its proof to the next section. 

For functions defined on intervals, the behavior of the functions at the endpoints 
will be relevant to us: we say that a continuous function $f : [a,b] \to \mathbb{C}$ with a
derivative on $(a,b)$ has a continuous derivative at one or both endpoints if $f'$ has
a finite limit at the endpoint in question; it is equivalent to say that $f$ extends to a
larger set so as to be differentiable in an open interval about the endpoint and to 
have its derivative be continuous at the endpoint. 


²Problem 6 at the end of Chapter II.
³Corollaries 1.60 and 1.66 in Basic.
⁴Theorem 1.61 in Basic.


Theorem 1.3 (Sturm's Theorem). Let $p$, $q$, and $r$ be continuous real-valued
functions on $[a,b]$ such that $p'$ and $r''$ exist and are continuous and such that $p$
and $r$ are everywhere positive for $a \le t \le b$. Let $c_1, c_2, d_1, d_2$ be real numbers
such that $c_1$ and $c_2$ are not both 0 and $d_1$ and $d_2$ are not both 0. Finally for
each complex number $\lambda$, let (SL) be the following set of conditions on a function
$u : [a,b] \to \mathbb{C}$ with two continuous derivatives:

\[ (p(t)u')' - q(t)u + \lambda r(t)u = 0, \tag{SL1} \]
\[ c_1u(a) + c_2u'(a) = 0 \quad\text{and}\quad d_1u(b) + d_2u'(b) = 0. \tag{SL2} \]

Then the system (SL) has a nonzero solution for a countably infinite set of values of
$\lambda$. If $E$ denotes this set of values, then the members $\lambda$ of $E$ are all real, they have no
limit point in $\mathbb{R}$, and the vector space of solutions of (SL) is 1-dimensional for each
such $\lambda$. The set $E$ is bounded below if $c_1c_2 \le 0$ and $d_1d_2 \ge 0$, and $E$ is bounded
below by 0 if these conditions and the condition $q \ge 0$ are all satisfied. In any
case, enumerate $E$ as $\lambda_1, \lambda_2, \dots$, let $u = \varphi_n$ be a nonzero solution of (SL) when
$\lambda = \lambda_n$, define $(f,g)_r = \int_a^b f(t)\overline{g(t)}\,r(t)\,dt$ and $\|f\|_r = \big(\int_a^b |f(t)|^2 r(t)\,dt\big)^{1/2}$
for continuous $f$ and $g$, and normalize $\varphi_n$ so that $\|\varphi_n\|_r = 1$. Then $(\varphi_n,\varphi_m)_r = 0$
for $m \ne n$, and the functions $\varphi_n$ satisfy the following completeness conditions:

(a) any $u$ having two continuous derivatives on $[a,b]$ and satisfying (SL2)
has the property that the series $\sum_{n=1}^\infty (u,\varphi_n)_r\,\varphi_n(t)$ converges absolutely
uniformly to $u(t)$ on $[a,b]$,

(b) the only continuous $\varphi$ on $[a,b]$ with $(\varphi,\varphi_n)_r = 0$ for all $n$ is $\varphi = 0$,

(c) any continuous $\varphi$ on $[a,b]$ satisfies $\|\varphi\|_r^2 = \sum_{n=1}^\infty |(\varphi,\varphi_n)_r|^2$.


REMARK. The expression converges absolutely uniformly in (a) means that
$\sum_{n=1}^\infty |(u,\varphi_n)_r\,\varphi_n(t)|$ converges uniformly.


EXAMPLE. The prototype for Theorem 1.3 is the constant-coefficient case
$p = r = 1$ and $q = 0$. The equation (SL1) is just $u'' + \lambda u = 0$. If $\lambda$ happens to be
$> 0$, then the solutions are $u(t) = C_1\cos pt + C_2\sin pt$, where $\lambda = p^2$. Suppose
$[a,b] = [0,\pi]$. The condition $c_1u(0) + c_2u'(0) = 0$ says that $c_1C_1 + pc_2C_2 = 0$
and forces a linear relationship between $C_1$ and $C_2$ that depends on $p$. The
condition $d_1u(\pi) + d_2u'(\pi) = 0$ gives a further such relationship. These two
conditions may or may not be compatible. An especially simple special case is
that $c_2 = d_2 = 0$, so that (SL2) requires $u(0) = u(\pi) = 0$. From $u(0) = 0$,
we get $C_1 = 0$, and then $u(\pi) = 0$ forces $\sin p\pi = 0$ if $u$ is to be a nonzero
solution. Thus $p$ must be an integer. It may be checked that $\lambda \le 0$ leads to no
nonzero solutions if $c_2 = d_2 = 0$. Part (a) of the theorem therefore says that any
twice continuously differentiable function $u(t)$ on $[0,\pi]$ vanishing at 0 and $\pi$
has an expansion $u(t) = \sum_{p=1}^\infty b_p\sin pt$, the series being absolutely uniformly
convergent.
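
The sine expansion in this prototype case can be checked numerically. The following is only a minimal sketch under stated assumptions: the test function $u(t) = t(\pi - t)$, the Simpson-rule helper `simpson`, and the cutoff `TERMS` are illustrative choices that do not come from the text. The coefficients are computed as $b_p = (2/\pi)\int_0^\pi u(t)\sin pt\,dt$, which is what $(u,\varphi_p)_r\,\varphi_p$ reduces to when $r = 1$ and $\varphi_p(t) = \sqrt{2/\pi}\,\sin pt$.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def u(t):
    # Hypothetical test function: C^2 on [0, pi] and zero at both endpoints.
    return t * (math.pi - t)

def sine_coefficient(p):
    # b_p = (2/pi) * integral over [0, pi] of u(t) sin(pt) dt
    return (2 / math.pi) * simpson(lambda t: u(t) * math.sin(p * t), 0.0, math.pi)

TERMS = 39  # number of terms kept in the partial sum (arbitrary cutoff)
coeffs = [sine_coefficient(p) for p in range(1, TERMS + 1)]

def partial_sum(t):
    # Partial sum of the sine series at the point t.
    return sum(b * math.sin(p * t) for p, b in enumerate(coeffs, start=1))

# Largest deviation between the partial sum and u on a grid of [0, pi].
grid = [math.pi * j / 200 for j in range(201)]
max_err = max(abs(partial_sum(t) - u(t)) for t in grid)
print(coeffs[0], coeffs[1], max_err)
```

For this particular $u$ the coefficients with even $p$ vanish and those with odd $p$ equal $8/(\pi p^3)$, so the small value of `max_err` is consistent with the absolute uniform convergence asserted in part (a).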


The first partial differential equation that we consider is the heat equation
$u_t = \Delta u$, and we are interested in real-valued solutions.

EXAMPLES WITH THE HEAT EQUATION.


(1) We suppose that there is a single space variable $x$ and that the set in
1-dimensional space is a rod $0 \le x \le l$. The unknown function is $u(x,t)$, and
the boundary data are


    $u(x,0) = f(x)$            (initial temperature equal to $f(x)$),
    $u(0,t) = u(l,t) = 0$      (ends of rod at absolute 0 temperature for all $t > 0$).


Heat flows in the rod for $t > 0$, and we want to know what happens. The
equation for the heat flow is $u_t = u_{xx}$, and we search for solutions of the form
$u(x,t) = X(x)T(t)$. Unless $T(t)$ is identically 0, the boundary data force
$X(x)T(0) = f(x)$ and $X(0) = X(l) = 0$. Substitution into the heat equation
gives

\[ X(x)T'(t) = X''(x)T(t). \]

We divide by $X(x)T(t)$ and obtain

\[ \frac{T'(t)}{T(t)} = \frac{X''(x)}{X(x)}. \]

A function of $t$ alone can equal a function of $x$ alone only if it is constant, and
thus

\[ \frac{T'(t)}{T(t)} = \frac{X''(x)}{X(x)} = c \]

for some real constant $c$. The bound variable is $x$, and we hope that the possible
values of $c$ lie in a discrete set. Suppose that $c$ is $> 0$, so that $c = p^2$ with $p > 0$.
The equation $X''(x)/X(x) = p^2$ would say that $X(x) = c_1e^{px} + c_2e^{-px}$. From
$X(0) = 0$, we get $c_2 = -c_1$, so that $X(x) = c_1(e^{px} - e^{-px})$. Since $e^{px} - e^{-px}$
is strictly increasing, $c_1(e^{pl} - e^{-pl}) = 0$ is impossible unless $c_1 = 0$. Thus we
must have $c \le 0$. Similarly $c = 0$ is impossible, and the conclusion is that $c < 0$.
We write $c = -p^2$ with $p > 0$. The equation is $X''(x) = -p^2X(x)$, and then
$X(x) = c_1\cos px + c_2\sin px$. The condition $X(0) = 0$ says $c_1 = 0$, and the
condition $X(l) = 0$ then says that $p = n\pi/l$ for some integer $n$. Thus

\[ X(x) = \sin(n\pi x/l), \]

up to a multiplicative constant. The $t$ equation becomes $T'(t) = -p^2T(t) =
-(n\pi/l)^2T(t)$, and hence

\[ T(t) = e^{-(n\pi/l)^2t} \]

up to a multiplicative constant. Our product solution is then a multiple of
$e^{-(n\pi/l)^2t}\sin(n\pi x/l)$, and the form of solution we expect for the boundary-value
problem is therefore

\[ u(x,t) = \sum_{n=1}^\infty c_ne^{-(n\pi/l)^2t}\sin(n\pi x/l). \]

The constants $c_n$ are determined by the condition at $t = 0$. We extend $f(x)$,
which is initially defined for $0 \le x \le l$, to be defined for $-l \le x \le l$ and to be
an odd function. The constants $c_n$ are then the Fourier coefficients of $f$ except
that the period is $2l$ rather than $2\pi$:

\[ f(x) \sim \sum_{n=1}^\infty c_n\sin\frac{n\pi x}{l} \quad\text{with}\quad c_n = \frac{1}{l}\int_{-l}^{l} f(y)\sin\frac{n\pi y}{l}\,dy = \frac{2}{l}\int_0^l f(y)\sin\frac{n\pi y}{l}\,dy. \]



Normally the Fourier series would have cosine terms as well as sine terms, but the 
cosine terms all have coefficient 0 since f is odd. In any event, we now have an 
explicit infinite series that we hope gives the desired solution u(x, t). Checking 
that the function $u(x,t)$ defined above is indeed the desired solution amounts
to handling steps (iii) and (iv) in the method of separation of variables. For
(iii), we want to know whether $f(x)$ really can be represented in the indicated
form. This example is simple enough that (iii) can be handled by the theory
of Fourier series as in Chapter I of Basic: since $f$ is assumed to have two
continuous derivatives on $[0,l]$, the Fourier series converges uniformly by the
Weierstrass M test, and the sum must be $f$ by the uniqueness theorem. Another
way of handling (iii) is to apply Theorem 1.3 to the equation $y'' + \lambda y = 0$
subject to the conditions $y(0) = 0$ and $y(l) = 0$: The theorem gives us a certain
unique abstract expansion without giving us formulas for the explicit functions 
that are involved. It says also that we have completeness and absolute uniform 
convergence. Since our explicit expansion with sines satisfies the requirements 
of the unique abstract expansion, it must agree with the abstract expansion and 
it must converge absolutely uniformly. Whichever approach we use, the result 
is that we have now handled (iii). Step (iv) in the method is the justification 
that u(x, t) has all the required properties: we have to check that the function in 
question solves the heat equation and takes on the asserted boundary values. The 
function in question satisfies the heat equation because of Theorem 1.1 and the 
rapid convergence of the series $\sum_{n=1}^\infty e^{-(n\pi/l)^2t}$ and its first and second derivatives.
The question about boundary values is completely settled by Proposition 1.2. For
the condition $u(x,0) = f(x)$, we take $X = [0,l]$, $Y = [0,+\infty)$, $y = t$,
$A_n(x) = c_n\sin(n\pi x/l)$, $B_n(t) = e^{-(n\pi/l)^2t}$, and $y_0 = 0$ in the proposition;
uniform convergence of $\sum |A_n(x)|$ follows either from Theorem 1.3 or from the
Fourier-series estimate $|c_n| \le C/n^2$, which in turn follows from the assumption
that $f$ has two continuous derivatives. The conditions $u(0,t) = u(l,t) = 0$ may
be verified in the same way by reversing the roles of the space variable and the
time variable. To check that $u(0,t) = 0$, for example, we use Proposition 1.2
with $X = (\delta,+\infty)$, $Y = [0,l]$, and $y_0 = 0$. Our boundary-value problem is
therefore now completely solved. 
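
The series solution of Example 1 can be tried out numerically. What follows is a rough sketch, not a substitute for steps (iii) and (iv): the rod length `L = 1`, the initial temperature $f(x) = x(l - x)$, the cutoff `TERMS`, and the helper names are all assumptions made only for illustration. The code computes $c_n = (2/l)\int_0^l f(y)\sin(n\pi y/l)\,dy$ by Simpson's rule, forms a partial sum of the series for $u(x,t)$, and estimates $u_t - u_{xx}$ by central differences.

```python
import math

L = 1.0      # rod length l (illustrative value)
TERMS = 50   # number of series terms kept (arbitrary cutoff)

def f(x):
    # Hypothetical initial temperature; vanishes at both ends of the rod.
    return x * (L - x)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def sine_coefficient(n):
    # c_n = (2/l) * integral over [0, l] of f(y) sin(n pi y / l) dy
    return (2 / L) * simpson(lambda y: f(y) * math.sin(n * math.pi * y / L), 0.0, L)

C = {n: sine_coefficient(n) for n in range(1, TERMS + 1)}

def u(x, t):
    # Partial sum of  sum_n c_n exp(-(n pi / l)^2 t) sin(n pi x / l).
    return sum(
        C[n] * math.exp(-((n * math.pi / L) ** 2) * t) * math.sin(n * math.pi * x / L)
        for n in range(1, TERMS + 1)
    )

def heat_residual(x, t, h=1e-3):
    # Central-difference estimate of u_t - u_xx; should be near 0.
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h ** 2
    return u_t - u_xx

print(abs(u(0.3, 0.0) - f(0.3)), abs(u(0.0, 0.05)), abs(heat_residual(0.3, 0.05)))
```

All three printed quantities should be small: the first tests the initial condition, the second the condition at the end $x = 0$, and the third the differential equation at an interior point.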

(2) We continue to assume that space is 1-dimensional and that the object of
interest is a rod $0 \le x \le l$. The unknown function for heat flow in the rod is still
$u(x,t)$, but this time the boundary data are


    $u(x,0) = f(x)$              (initial temperature equal to $f(x)$),
    $u_x(0,t) = u_x(l,t) = 0$    (ends of rod perfectly insulated for all $t > 0$).


In the same way as in Example 1, a product solution $X(x)T(t)$ leads to a separated
equation $T'(t)/T(t) = X''(x)/X(x)$, and both sides must be some constant $-\lambda$.
The equation for $X(x)$ is then

\[ X'' + \lambda X = 0 \quad\text{with}\quad X'(0) = X'(l) = 0. \]

We find that $\lambda$ has to be of the form $p^2$ with $p = n\pi/l$ for some integer $n \ge 0$,
and $X(x)$ has to be a multiple of $\cos(n\pi x/l)$. Taking into account the formula
$\lambda = p^2$, we see that the equation for $T(t)$ is

\[ T'(t) = -p^2T(t). \]

Then $T(t)$ has to be a multiple of $e^{-(n\pi/l)^2t}$, and our product solution is a multiple
of $e^{-(n\pi/l)^2t}\cos(n\pi x/l)$. The form of solution we expect for the boundary-value
problem is therefore

\[ u(x,t) = \sum_{n=0}^\infty c_ne^{-(n\pi/l)^2t}\cos(n\pi x/l). \]

We determine the coefficients $c_n$ by using the initial condition $u(x,0) = f(x)$,
and thus we want to represent $f(x)$ by a series of cosines:

\[ f(x) \sim \sum_{n=0}^\infty c_n\cos\frac{n\pi x}{l}. \]

We can do so by extending $f(x)$ from $[0,l]$ to $[-l,l]$ so as to be even and using
ordinary Fourier coefficients. The formula is therefore $c_n = \frac{2}{l}\int_0^l f(y)\cos\frac{n\pi y}{l}\,dy$
for $n > 0$, with $c_0 = \frac{1}{l}\int_0^l f(y)\,dy$. Again as in Example 1, we can carry out step
(iii) of the method either by using the theory of Fourier series or by appealing 
to Theorem 1.3. In step (iv), we can again use Theorem 1.1 to see that the 
prospective function u(x, t) satisfies the heat equation, and the boundary-value 
conditions can be checked with the aid of Proposition 1.2.

(3) We still assume that space is 1-dimensional and that the object of interest
is a rod $0 \le x \le l$. The unknown function for heat flow in the rod is still $u(x,t)$,
but this time the boundary data are


    $u(x,0) = f(x)$            (initial temperature equal to $f(x)$),
    $u(0,t) = 0$               (one end of rod held at temperature 0),
    $u_x(l,t) = -hu(l,t)$      (other end radiating into a medium of temperature 0),


and $h$ is assumed positive. In the same way as in Example 1, a product solution
$X(x)T(t)$ leads to a separated equation $T'(t)/T(t) = X''(x)/X(x)$, and both
sides must be some constant $-\lambda$. The equation for $X(x)$ is then

\[ X'' + \lambda X = 0 \quad\text{with}\quad \begin{cases} X(0) = 0, \\ hX(l) + X'(l) = 0. \end{cases} \]

From the equation $X'' + \lambda X = 0$ and the condition $X(0) = 0$, $X(x)$ has to be
a multiple of $\sinh px$ with $\lambda = -p^2 < 0$, or of $x$ with $\lambda = 0$, or of $\sin px$ with
$\lambda = p^2 > 0$. In the first two cases, $hX(l) + X'(l)$ equals $h\sinh pl + p\cosh pl$
or $hl + 1$ and cannot be 0. Thus we must have $\lambda = p^2 > 0$, and $X(x)$ is a
multiple of $\sin px$. The condition $hX(l) + X'(l) = 0$ then holds if and only if
$h\sin pl + p\cos pl = 0$. This equation has infinitely many positive solutions $p$,
and we write them as $p_1, p_2, \dots$. See Figure 1.1 for what happens when $l = \pi$.


FIGURE 1.1. Graphs of $\sin \pi p$ and $-p\cos \pi p$. The graphs
intersect for infinitely many values of $\pm p$.


If $\lambda = p_n^2$, then the equation for $T(t)$ is $T'(t) = -p_n^2T(t)$, and $T(t)$ has to be a
multiple of $e^{-p_n^2t}$. Thus our product solution is a multiple of $e^{-p_n^2t}\sin p_nx$, and
the form of solution we expect for the boundary-value problem is

\[ u(x,t) = \sum_{n=1}^\infty c_ne^{-p_n^2t}\sin p_nx. \]

Putting $t = 0$, we see that we want to choose constants $c_n$ such that

\[ f(x) \sim \sum_{n=1}^\infty c_n\sin p_nx. \]

There is no reason why the numbers $p_n$ should form an arithmetic progression,
and such an expansion is not a result in the subject of Fourier series. To handle
step (iii), this time we appeal to Theorem 1.3. That theorem points out the
remarkable fact that the functions $\sin p_nx$ satisfy the orthogonality property
$\int_0^l \sin p_nx\,\sin p_mx\,dx = 0$ if $n \ne m$ and therefore that

\[ c_n = \int_0^l f(y)\sin p_ny\,dy \Big/ \int_0^l \sin^2 p_ny\,dy. \]

Even more remarkably, the theorem gives us a completeness result and a convergence
result. Thus (iii) is completely finished. In step (iv), we use Theorem 1.1 to
check that $u(x,t)$ satisfies the partial differential equation, just as in Examples 1
and 2. The same technique as in Examples 1 and 2 with Proposition 1.2 works to
recover the boundary value $u(x,0)$ as a limit; this time we use Theorem 1.3 for the
absolute uniform convergence in the $x$ variable. For $u(0,t)$, one new comment
is appropriate: we take $X = (\delta,+\infty)$, $Y = [0,l]$, $y_0 = 0$, $A_n(t) = e^{-p_n^2t}$, and
$B_n(x) = c_n\sin p_nx$; although the estimate $|B_n(x)| \le 1$ may not be valid for
all $n$, it is valid for $n$ sufficiently large because of the uniform convergence of
$\sum c_n\sin p_nx$.
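
The numbers $p_n$ in Example 3 have no closed form, but they are easy to approximate. The sketch below is only an illustration under assumed values: it takes $h = 1$ and $l = \pi$ (the case of Figure 1.1), and the names `g`, `bisect`, `root`, and `inner` are hypothetical helpers. It relies on the observation that $h\sin pl + p\cos pl$ changes sign between $p = (n - \tfrac12)\pi/l$ and $p = n\pi/l$, brackets $p_n$ there, and then evaluates $\int_0^l \sin ax\,\sin bx\,dx$ for $a \ne b$ from its elementary closed form.

```python
import math

H = 1.0        # radiation constant h > 0 (illustrative value)
L = math.pi    # rod length l, matching the case l = pi of Figure 1.1

def g(p):
    # Left side of the condition h sin(pl) + p cos(pl) = 0.
    return H * math.sin(p * L) + p * math.cos(p * L)

def bisect(func, lo, hi, iterations=200):
    """Plain bisection; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def root(n):
    # g equals +-h at (n - 1/2) pi / l and +-p at n pi / l, with opposite signs.
    return bisect(g, (n - 0.5) * math.pi / L, n * math.pi / L)

def inner(a, b):
    # Closed form of the integral over [0, l] of sin(ax) sin(bx) dx for a != b.
    return math.sin((a - b) * L) / (2 * (a - b)) - math.sin((a + b) * L) / (2 * (a + b))

p = [root(n) for n in range(1, 6)]
gaps = [p[i + 1] - p[i] for i in range(len(p) - 1)]
worst = max(abs(inner(p[i], p[j])) for i in range(5) for j in range(i + 1, 5))
print(p, gaps, worst)
```

The gaps between consecutive roots are visibly unequal, in line with the remark that the $p_n$ need not form an arithmetic progression, while the computed inner products of distinct $\sin p_nx$ are zero to within rounding.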

(4) This time we assume that space is 2-dimensional and that the object of
interest is a circular plate. The unknown function for heat flow in the plate is
$u(x,y,t)$, the differential equation is $u_t = u_{xx} + u_{yy}$, and the assumptions about
boundary data are that the temperature distribution is known on the plate at $t = 0$
and that the edge of the plate is held at temperature 0 for all $t > 0$. Let us use polar
coordinates $(r,\theta)$ in the $(x,y)$ plane, let us assume that the plate is described by
$r \le 1$, and let us write the unknown function as $v(r,\theta,t) = u(r\cos\theta, r\sin\theta, t)$.
The heat equation becomes

\[ v_t = v_{rr} + r^{-1}v_r + r^{-2}v_{\theta\theta}, \]

and the boundary data are given by


    $v(r,\theta,0) = f(r,\theta)$    (initial temperature equal to $f(r,\theta)$),
    $v(1,\theta,t) = 0$              (edge of plate held at temperature 0).

We first look for solutions of the heat equation of the form $R(r)\Theta(\theta)T(t)$.
Substitution and division by $R(r)\Theta(\theta)T(t)$ gives

\[ \frac{R''(r)}{R(r)} + \frac{1}{r}\frac{R'(r)}{R(r)} + \frac{1}{r^2}\frac{\Theta''(\theta)}{\Theta(\theta)} = \frac{T'(t)}{T(t)} = -c, \]

so that $T(t)$ is a multiple of $e^{-ct}$. The equation relating $R$, $\Theta$, and $c$ becomes

\[ \frac{r^2R''(r)}{R(r)} + \frac{rR'(r)}{R(r)} + \frac{\Theta''(\theta)}{\Theta(\theta)} = -cr^2. \]

Therefore

\[ \frac{\Theta''(\theta)}{\Theta(\theta)} = -\frac{r^2R''(r)}{R(r)} - \frac{rR'(r)}{R(r)} - cr^2 = -\lambda. \]

Since $\Theta(\theta)$ has to be periodic of period $2\pi$, we must have $\lambda = n^2$ with $n$ an
integer $\ge 0$; then $\Theta(\theta) = c_1\cos n\theta + c_2\sin n\theta$. The equation for $R(r)$ becomes

\[ r^2R'' + rR' + (cr^2 - n^2)R = 0. \]

This has a regular singular point at $r = 0$, and the indicial equation is $s^2 = n^2$.
Thus $s = \pm n$. In fact, we can recognize this equation as Bessel's equation of order
$n$ by a change of variables: A little argument excludes $c \le 0$. Putting $k = \sqrt{c}$,
$\rho = kr$, and $y(\rho) = R(r)$ leads to $y'' + \rho^{-1}y' + (1 - n^2\rho^{-2})y = 0$, which is
exactly Bessel's equation of order $n$. Transforming the solution $y(\rho) = J_n(\rho)$
back with $r = k^{-1}\rho$, we see that $R(r) = y(\rho) = J_n(\rho) = J_n(kr)$ is a solution of
the equation for $R$. A basic product solution is therefore $\tfrac12 a_{0,k}J_0(kr)e^{-k^2t}$ if $n = 0$ or

\[ J_n(kr)(a_{n,k}\cos n\theta + b_{n,k}\sin n\theta)e^{-k^2t} \]

if $n > 0$. The index $n$ has to be an integer in order for $v$ to be well behaved at the
center, or origin, of the plate, but we have not thus far restricted $k$ to a discrete
set. However, the condition of temperature 0 at $r = 1$ means that $J_n(k)$ has to be
0, and the zeros of $J_n$ form a discrete set. The given condition at $t = 0$ means
that we want

\[ f(r,\theta) \sim \frac12\sum_{\substack{k>0\ \text{with}\\ J_0(k)=0}} a_{0,k}J_0(kr) + \sum_{n=1}^\infty\Big(\sum_{\substack{k>0\ \text{with}\\ J_n(k)=0}} (a_{n,k}\cos n\theta + b_{n,k}\sin n\theta)J_n(kr)\Big). \]

We do not have the tools to establish this kind of relation, but we can see a hint 
of what to do. The orthogonality conditions that allow us to write candidates for 
the coefficients are the usual orthogonality for trigonometric functions and the 
relation 


\[ \int_0^1 J_n(kr)J_n(k'r)\,r\,dr = 0 \qquad\text{if } J_n(k) = J_n(k') = 0 \text{ and } k \ne k'. \]


The latter is not quite a consequence of Theorem 1.3, but it is close since the 
equation satisfied by $y_k(r) = J_n(kr)$, namely


\[ (ry_k')' + (k^2r - n^2r^{-1})y_k = ry_k'' + y_k' + (k^2r - n^2r^{-1})y_k = 0, \]




fails to be of the form in Theorem 1.3 only because of trouble at the endpoint 
r = 0 of the domain interval. In fact, the argument in the next section for the 
orthogonality in Theorem 1.3 will work also in this case; see Problem 2 at the 
end of the chapter. Thus put 


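\[ a_n(r) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(r,\theta)\cos n\theta\,d\theta \quad\text{and}\quad b_n(r) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(r,\theta)\sin n\theta\,d\theta, \]

so that

\[ f(r,\theta) \sim \tfrac12 a_0(r) + \sum_{n=1}^\infty \big(a_n(r)\cos n\theta + b_n(r)\sin n\theta\big) \quad\text{for each } r. \]

Then put

\[ a_{n,k} = \int_0^1 a_n(r)y_k(r)\,r\,dr \Big/ \int_0^1 y_k(r)^2\,r\,dr \]

and

\[ b_{n,k} = \int_0^1 b_n(r)y_k(r)\,r\,dr \Big/ \int_0^1 y_k(r)^2\,r\,dr. \]

With these values in place, handling step (iii) amounts to showing that

\[ f(r,\theta) = \frac12\sum_{\substack{k>0\ \text{with}\\ J_0(k)=0}} a_{0,k}J_0(kr) + \sum_{n=1}^\infty\Big(\sum_{\substack{k>0\ \text{with}\\ J_n(k)=0}} (a_{n,k}\cos n\theta + b_{n,k}\sin n\theta)J_n(kr)\Big) \]

for functions $f$ of class $C^2$. This formula is valid, but we would need a result
from Sturm–Liouville theory that is different from Theorem 1.3 in order to prove
it. Step (iv) is to use the convergence from Sturm–Liouville theory, together with
application of Proposition 1.2 and Theorem 1.1, to see that the function $v(r,\theta,t)$
given by

\[ \frac12\sum_{\substack{k>0\ \text{with}\\ J_0(k)=0}} a_{0,k}J_0(kr)e^{-k^2t} + \sum_{n=1}^\infty\Big(\sum_{\substack{k>0\ \text{with}\\ J_n(k)=0}} (a_{n,k}\cos n\theta + b_{n,k}\sin n\theta)J_n(kr)e^{-k^2t}\Big) \]

has all the required properties.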
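
The orthogonality relation used in Example 4 can be spot-checked for $n = 0$. This is a minimal sketch under assumptions that are not in the text: $J_0$ is evaluated from its standard power series $\sum_m (-1)^m (x/2)^{2m}/(m!)^2$, its first two positive zeros are bracketed in $[2,3]$ and $[5,6]$, and the helper names are hypothetical.

```python
import math

def bessel_j0(x, terms=40):
    """J_0(x) from the power series sum_m (-1)^m (x/2)^(2m) / (m!)^2."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((x / 2) ** 2) / ((m + 1) ** 2)
    return total

def bisect(func, lo, hi, iterations=100):
    """Plain bisection; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def weighted_inner(k, k_prime):
    # Integral over [0, 1] of J_0(kr) J_0(k'r) r dr.
    return simpson(lambda r: bessel_j0(k * r) * bessel_j0(k_prime * r) * r, 0.0, 1.0)

k1 = bisect(bessel_j0, 2.0, 3.0)   # first positive zero of J_0
k2 = bisect(bessel_j0, 5.0, 6.0)   # second positive zero of J_0
print(k1, k2, weighted_inner(k1, k2), weighted_inner(k1, k1))
```

The mixed integral comes out numerically zero while the integral of $J_0(k_1r)^2\,r$ is positive, which is the pattern that makes the quotient formulas for $a_{n,k}$ and $b_{n,k}$ meaningful.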


The second partial differential equation that we consider is the Laplace
equation $\Delta u = 0$. Various sets of boundary data can be given, but we deal
only with the values of $u$ on the edge of its bounded domain of definition. In this
case the problem of finding $u$ is known as the Dirichlet problem.

EXAMPLES WITH LAPLACE EQUATION.


(1) We suppose that the space domain is the unit disk in $\mathbb{R}^2$. The Laplace
equation in polar coordinates $(r,\theta)$ is $u_{rr} + r^{-1}u_r + r^{-2}u_{\theta\theta} = 0$. The unknown
function is $u(r,\theta)$, and the given boundary values of $u$ for the Dirichlet problem
are

    $u(1,\theta) = f(\theta)$        (value on unit circle).


It is implicit that $u(r,\theta)$ is to be periodic of period $2\pi$ in $\theta$ and is to be well
behaved at $r = 0$. A product solution is of the form $R(r)\Theta(\theta)$. We substitute
into the equation, divide by $r^{-2}R(r)\Theta(\theta)$, and find that the variables separate:

\[ \frac{r^2R''}{R} + \frac{rR'}{R} = -\frac{\Theta''}{\Theta} = c. \]

The equation for $\Theta$ is $\Theta'' + c\Theta = 0$, and the solution is required to be periodic.
We might be tempted to try to apply Theorem 1.3 at this stage, but the boundary
condition of periodicity, $\Theta(-\pi) = \Theta(\pi)$, is not exactly of the right kind for
Theorem 1.3. Fortunately we can handle matters directly, using Fourier series
in the analysis. The periodicity forces $c = n^2$ with $n$ an integer $\ge 0$. Then
$\Theta(\theta) = c_1\cos n\theta + c_2\sin n\theta$, except that the sine term is not needed when
$n = 0$. The equation for $R$ becomes

\[ r^2R'' + rR' - n^2R = 0. \]

This is an Euler equation with indicial equation $s^2 = n^2$, and hence $s = \pm n$. We
discard $-n$ with $n \ge 1$ because the solution $r^{-n}$ is not well behaved at $r = 0$, and
we discard also the second solution $\log r$ that goes with $n = 0$. Consequently
$R(r)$ is a multiple of $r^n$, and the product solution is $r^n(a_n\cos n\theta + b_n\sin n\theta)$
when $n > 0$. The expected solution of the Laplace equation is then

\[ u(r,\theta) = \tfrac12 a_0 + \sum_{n=1}^\infty r^n(a_n\cos n\theta + b_n\sin n\theta). \]



We determine $a_n$ and $b_n$ by formally putting $r = 1$, and we see that $a_n$ and
$b_n$ are to be the ordinary Fourier coefficients of $f(\theta)$. The normal assumption
for a boundary-value problem is that $f$ is as nice a function as $u$ and hence
has two continuous derivatives. In this case we know that the Fourier series
converges to $f(\theta)$ uniformly. It is immediate from Theorem 1.1 that $u(r,\theta)$
satisfies Laplace's equation for $r < 1$, and Proposition 1.2 shows that $u(r,\theta)$ has
the desired boundary values. This completes the solution of the boundary-value
problem. In this example the solution $u(r,\theta)$ is given by a nice integral formula:
The same easy computation that expresses the partial sums of a Fourier series in
terms of the Dirichlet kernel allows us to write $u(r,\theta)$ in terms of the Poisson
kernel

\[ P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2} = \sum_{n=-\infty}^{\infty} r^{|n|}e^{in\theta}, \]

namely

\[ u(r,\theta) = \sum_{n=-\infty}^{\infty} r^{|n|}\Big(\frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi)e^{-in\varphi}\,d\varphi\Big)e^{in\theta} \]

\[ = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi)P_r(\theta - \varphi)\,d\varphi \]

\[ = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta - \varphi)P_r(\varphi)\,d\varphi. \]

The interchange of integral and sum for the second equality is valid because of the
uniform convergence of the series $\sum_{n=-\infty}^{\infty} r^{|n|}e^{in(\theta-\varphi)}$ for fixed $r$. The resulting
formula for $u(r,\theta)$ is known as the Poisson integral formula for the unit disk.
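
Both descriptions of the Poisson kernel, and the integral formula itself, are easy to test numerically. The sketch below rests on illustrative assumptions: the boundary function $f(\theta) = \cos\theta$, for which the series solution is simply $r\cos\theta$, an equally spaced quadrature rule with `points` nodes for the periodic integral, and hypothetical helper names.

```python
import math

def poisson_closed(r, theta):
    # Closed form (1 - r^2) / (1 - 2 r cos(theta) + r^2).
    return (1 - r * r) / (1 - 2 * r * math.cos(theta) + r * r)

def poisson_series(r, theta, terms=200):
    # Truncation of sum over all n of r^|n| e^{i n theta}; the terms for n and
    # -n combine to 2 r^n cos(n theta), so the sum is real.
    return 1 + 2 * sum(r ** n * math.cos(n * theta) for n in range(1, terms + 1))

def f(theta):
    # Hypothetical boundary data on the unit circle.
    return math.cos(theta)

def poisson_integral(r, theta, points=400):
    # (1 / 2 pi) * integral over [-pi, pi] of f(phi) P_r(theta - phi) d phi,
    # approximated with equally spaced nodes (accurate for periodic integrands).
    step = 2 * math.pi / points
    nodes = (-math.pi + j * step for j in range(points))
    return sum(f(phi) * poisson_closed(r, theta - phi) for phi in nodes) / points

r, theta = 0.5, 1.0
print(poisson_closed(r, theta), poisson_series(r, theta))
print(poisson_integral(r, theta), r * math.cos(theta))
```

Each printed pair should agree to many decimal places. Agreement degrades as $r$ approaches 1, where the kernel concentrates near $\theta = 0$ and more series terms and quadrature nodes are needed.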


(2) We suppose that the space domain is the unit ball in $\mathbb{R}^3$. The Laplace
equation in spherical coordinates $(r,\varphi,\theta)$, with $\varphi$ measuring latitude from the
point $(x,y,z) = (0,0,1)$, is

\[ (r^2u_r)_r + \frac{1}{\sin\varphi}\big((\sin\varphi)u_\varphi\big)_\varphi + \frac{1}{\sin^2\varphi}u_{\theta\theta} = 0. \]

The unknown function is $u(r,\varphi,\theta)$, and the given boundary values of $u$ for the
Dirichlet problem are


    $u(1,\varphi,\theta) = f(\varphi,\theta)$    (value on unit sphere).


The function $u$ is to be periodic in $\theta$ and is to be well behaved at $r = 0$, $\varphi = 0$, and
$\varphi = \pi$. Searching for a solution $R(r)\Phi(\varphi)\Theta(\theta)$ leads to the separated equation

\[ \frac{r^2R'' + 2rR'}{R} = -\frac{\Phi'' + (\cot\varphi)\Phi'}{\Phi} - \frac{1}{\sin^2\varphi}\frac{\Theta''}{\Theta} = c. \]

The resulting equation for $R$ is $r^2R'' + 2rR' - cR = 0$, which is an Euler equation
whose indicial equation has roots $s$ satisfying $s(s+1) = c$. The condition that a
solution of the Laplace equation be well behaved at $r = 0$ means that the solution
$r^s$ must have $s$ equal to an integer $m \ge 0$. Then $R(r)$ is a multiple of $r^m$ with $m$
an integer $\ge 0$ and with $c = m(m+1)$. The equation involving $\Phi$ and $\Theta$ is then

\[ (\sin^2\varphi)\frac{\Phi'' + (\cot\varphi)\Phi'}{\Phi} + \frac{\Theta''}{\Theta} + m(m+1)\sin^2\varphi = 0. \]

This equation shows that $\Theta''/\Theta = c'$, and as usual we obtain $c' = -n^2$ with $n$ an
integer $\ge 0$. Then $\Theta(\theta) = c_1\cos n\theta + c_2\sin n\theta$. Substituting into the equation
for $\Phi$ yields

\[ (\sin^2\varphi)\frac{\Phi'' + (\cot\varphi)\Phi'}{\Phi} - n^2 + m(m+1)\sin^2\varphi = 0. \]

We make the change of variables $t = \cos\varphi$, which has

\[ \frac{d}{d\varphi} = -\sin\varphi\,\frac{d}{dt} \quad\text{and}\quad \frac{d^2}{d\varphi^2} = -(\cos\varphi)\frac{d}{dt} + (\sin^2\varphi)\frac{d^2}{dt^2}. \]

Putting $P(t) = P(\cos\varphi) = \Phi(\varphi)$ for $0 \le \varphi \le \pi$ leads to

\[ (1 - t^2)\,\frac{(1 - t^2)P'' - tP' + (\cot\varphi)(-\sin\varphi)P'}{P} - n^2 + m(m+1)(1 - t^2) = 0 \]

and then to

\[ (1 - t^2)P'' - 2tP' + \Big[m(m+1) - \frac{n^2}{1 - t^2}\Big]P = 0. \]

This is known as an associated Legendre equation. For $n = 0$, which is the
case of a solution independent of longitude $\theta$, the equation reduces to the ordinary
Legendre equation.⁵ Suppose for simplicity that $f$ is independent of longitude $\theta$
and that we can take $n = 0$ in this equation. One solution of the equation for $P$ is
$P(t) = P_m(t)$, the $m$th Legendre polynomial. This is well behaved at $t = \pm 1$, the
values of $t$ that correspond to $\varphi = 0$ and $\varphi = \pi$. Making a change of variables,
we can see that the Legendre equation has regular singular points at $t = 1$ and
$t = -1$. By examining the indicial equations at these points, we can see that
there is only a 1-parameter family of solutions of the equation for $P$ that are well
behaved at $t = \pm 1$. Thus $\Phi(\varphi)$ has to be a multiple of $P_m(\cos\varphi)$, and we are led
to expect

\[ u(r,\varphi,\theta) = \sum_{m=0}^\infty c_mr^mP_m(\cos\varphi) \]

for solutions that are independent of $\theta$. If $f(\varphi,\theta)$ is independent of $\theta$, we
determine $c_m$ by the formula

\[ f(\varphi,\theta) \sim \sum_{m=0}^\infty c_mP_m(\cos\varphi). \]


⁵The ordinary Legendre equation is $(1 - t^2)P'' - 2tP' + m(m+1)P = 0$, as in Section IV.8 of
Basic.


The coefficients can be determined because the polynomials $P_m$ are orthogonal
under integration over $[-1,1]$. To see this fact, we first rewrite the equation for
$P$ as $((1 - t^2)P')' + m(m+1)P = 0$. This is almost of the form in Theorem
1.3, but the coefficient $1 - t^2$ vanishes at the endpoints $t = \pm 1$. Although the
orthogonality does not then follow from Theorem 1.3, it may be proved in the
same way as the orthogonality that is part of Theorem 1.3; see Problem 2 at the end
of the chapter. A part of the completeness question is easily settled by observing
that $P_m$ is of degree $m$ and that therefore the linear span of $\{P_0, P_1, \dots, P_N\}$
is the same as the linear span of $\{1, t, \dots, t^N\}$. This much does not establish,
however, that the series $\sum c_mP_m(t)$ converges uniformly. For that, we would need
yet another result from Sturm—Liouville theory or elsewhere. Once the uniform 
convergence has been established, step (iv) can be handled in the usual way. 
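
The orthogonality of the Legendre polynomials over $[-1,1]$ can be confirmed exactly for small degrees. The sketch below makes assumptions that go beyond this section: it builds $P_m$ from the standard three-term recurrence $(k+1)P_{k+1}(t) = (2k+1)tP_k(t) - kP_{k-1}(t)$ with $P_0 = 1$ and $P_1 = t$ (the usual normalization $P_m(1) = 1$), and the function names are hypothetical. Rational arithmetic keeps every integral exact.

```python
from fractions import Fraction

def legendre(m):
    """Coefficients of P_m, constant term first, from the recurrence
    (k+1) P_{k+1}(t) = (2k+1) t P_k(t) - k P_{k-1}(t)."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if m == 0:
        return prev
    for k in range(1, m):
        shifted = [Fraction(0)] + cur                       # coefficients of t * P_k
        padded = prev + [Fraction(0)] * (len(shifted) - len(prev))
        nxt = [((2 * k + 1) * s - k * q) / (k + 1) for s, q in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur

def integrate_product(a, b):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if (i + j) % 2 == 0:                            # odd powers integrate to 0
                total += a_i * b_j * Fraction(2, i + j + 1)
    return total

# Table of inner products for degrees 0 through 4.
for m in range(5):
    print([str(integrate_product(legendre(m), legendre(n))) for n in range(5)])
```

The printed table is diagonal, and with this normalization the diagonal entry for degree $m$ is $2/(2m+1)$, which is the quantity by which one divides to obtain the coefficient $c_m$.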


The third and final partial differential equation that we consider is the wave 
equation $u_{tt} = \Delta u$. We consider examples of boundary-value problems in one
and two space variables. 


EXAMPLES WITH WAVE EQUATION. 

(1) A string on the $x$ axis under tension is such that each point can be displaced
only in the $y$ direction. Let $y = u(x,t)$ be the displacement. The equation for
the unknown function $u(x,t)$ in suitable physical units is $u_{tt} = u_{xx}$, and the
boundary data are


    $u(x,0) = f(x)$            (initial displacement),
    $u_t(x,0) = g(x)$          (initial velocity),
    $u(0,t) = u(l,t) = 0$      (ends of string fixed for all $t > 0$).


The string vibrates for $t > 0$, and we want to know what happens. Searching
for basic product solutions $X(x)T(t)$, we are led to $T''/T = X''/X = $ constant.
As usual the conditions at $x = 0$ and $x = l$ force the constant to be nonpositive,
necessarily $-\omega^2$ with $\omega \ge 0$. Then $X(x) = c_1\cos\omega x + c_2\sin\omega x$. We obtain
$c_1 = 0$ from $X(0) = 0$, and we obtain $\omega = n\pi/l$, with $n$ an integer, from
$X(l) = 0$. Thus $X(x)$ has to be a multiple of $\sin(n\pi x/l)$, and we may take
$n > 0$. Examining the $T$ equation, we are readily led to expect

\[ u(x,t) = \sum_{n=1}^\infty \sin(n\pi x/l)\big[a_n\cos(n\pi t/l) + b_n\sin(n\pi t/l)\big]. \]



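The conditions $u(x, 0) = f(x)$ and $u_t(x, 0) = g(x)$ say that

\[ f(x) \sim \sum_{n=1}^{\infty} a_n \sin\Bigl(\frac{n\pi x}{l}\Bigr) \quad\text{and}\quad g(x) \sim \sum_{n=1}^{\infty} \Bigl(\frac{n\pi}{l}\Bigr) b_n \sin\Bigl(\frac{n\pi x}{l}\Bigr), \]

so that $a_n$ and $n\pi b_n/l$ are coefficients in the Fourier sine series for $f$ and $g$. Steps
(iii) and (iv) in the method follow in the same way as in earlier examples.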
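
As a numerical illustration of the vibrating-string series, the following Python sketch computes the sine coefficients by quadrature and evaluates a partial sum. The length $l = 1$, the initial displacement $f(x) = x(1 - x)$, the zero initial velocity, the Simpson rule, and all function names are assumptions chosen for this sketch; they do not come from the text.

```python
import math

def simpson(h, a, b, n=2000):
    """Composite Simpson approximation of the integral of h over [a, b] (n even)."""
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3

def sine_coefficient(f, n, l):
    """n-th Fourier sine coefficient (2/l) * integral_0^l f(x) sin(n pi x / l) dx."""
    return (2.0 / l) * simpson(lambda x: f(x) * math.sin(n * math.pi * x / l), 0.0, l)

def string_partial_sum(f, g, l, x, t, terms=50):
    """Partial sum of sum_n sin(n pi x/l) [a_n cos(n pi t/l) + b_n sin(n pi t/l)]."""
    total = 0.0
    for n in range(1, terms + 1):
        w = n * math.pi / l
        a_n = sine_coefficient(f, n, l)
        # (n pi / l) b_n is the n-th sine coefficient of g, so divide by w to get b_n.
        b_n = sine_coefficient(g, n, l) / w
        total += math.sin(w * x) * (a_n * math.cos(w * t) + b_n * math.sin(w * t))
    return total

# Assumed sample data: a string of length 1 released from rest.
l = 1.0
f = lambda x: x * (1.0 - x)
g = lambda x: 0.0

u_start = string_partial_sum(f, g, l, 0.3, 0.0)   # should be close to f(0.3)
u_period = string_partial_sum(f, g, l, 0.3, 2.0)  # every term has period 2l in t
```

In this sketch the partial sum at $t = 0$ reproduces $f$ to within the truncation error, and the value at $t = 2l$ agrees with the value at $t = 0$ because each term of the series is periodic in $t$ with period $2l$.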


(2) We visualize a vibrating circular drum. A membrane in the $(x, y)$ plane
covers the unit disk and is under uniform tension. Each point can be displaced
only in the $z$ direction. Let $u(x, y, t) = U(r, \theta, t)$ be the displacement. The
wave equation $u_{tt} = u_{xx} + u_{yy}$ becomes $U_{tt} = U_{rr} + r^{-1}U_r + r^{-2}U_{\theta\theta}$ in polar
coordinates. Assume for simplicity that the boundary data are

\[
\begin{aligned}
U(r,\theta,0) &= f(r) && \text{(initial displacement independent of $\theta$)},\\
U_t(r,\theta,0) &= 0 && \text{(initial velocity 0)},\\
U(1,\theta,t) &= 0 && \text{(edge of drum fixed for all $t > 0$)}.
\end{aligned}
\]


Because of the radial symmetry, let us look for basic product solutions of the
form $R(r)T(t)$. Substituting and separating variables, we are led to $T''/T =
(R'' + r^{-1}R')/R = c$. The equation for $R$ is $r^2R'' + rR' - cr^2R = 0$, and
the usual considerations do not determine the sign of $c$. The equation for $R$ has
a regular singular point at $r = 0$, but it is not an Euler equation. The indicial
equation is $s^2 = 0$, with $s = 0$ as a root of multiplicity 2, independently of $c$.
One solution is given by a power series in $r$, while another involves $\log r$. We
discard the solution with the logarithm because it would represent a singularity at
the middle of the drum. To get at the sign of $c$, we use the condition $R(1) = 0$ and
argue as follows: Without loss of generality, $R(0)$ is positive. Suppose $c > 0$,
and let $r_1 \le 1$ be the first value of $r > 0$ where $R(r_1) = 0$. From the equation
$r^{-1}(rR')' = cR$ and the inequality $R(r) > 0$ for $0 < r < r_1$, we see that $rR'$
is strictly increasing for $0 < r < r_1$. Examining the power series expansion for
$R(r)$, we see that $R'(0) = 0$. Thus $R'(r) > 0$ for $0 < r < r_1$. But $R(0) > 0$ and
$R(r_1) = 0$ imply, by the Mean Value Theorem, that $R'(r)$ is $< 0$ somewhere in
between, and we have a contradiction. Similarly we rule out $c = 0$. We conclude
that $c$ is negative, i.e., $c = -k^2$ with $k > 0$. The equation for $R$ is then


\[ r^2R'' + rR' + k^2r^2R = 0. \]

The change of variables $\rho = kr$ reduces this equation to Bessel’s equation of order
0, and the upshot is that $R(r)$ is a multiple of $J_0(kr)$. The condition $R(1) = 0$
means that $J_0(k) = 0$. If $k_n$ is the $n$th positive zero of $J_0$, then the $T$ equation is
$T'' + k_n^2T = 0$, so that $T(t) = c_1\cos k_nt + c_2\sin k_nt$. From $U_t(r, \theta, 0) = 0$, we
obtain $c_2 = 0$. Thus $T(t)$ is a multiple of $\cos k_nt$, and we expect that

\[ U(r,\theta,t) = \sum_{n=1}^{\infty} c_n J_0(k_nr)\cos k_nt. \]

In step (iii), the determination of the $c_n$’s and the necessary analysis are similar to
those in Example 4 for the heat equation, and it is not necessary to repeat them.
Step (iv) is handled in much the same way as in the vibrating-string problem.
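
The numbers $k_n$ in the drum example can be approximated directly. The Python sketch below evaluates $J_0$ from its power series $\sum_{n\ge 0}(-1)^n(x/2)^{2n}/(n!)^2$ and locates its first few positive zeros by scanning for sign changes and bisecting. The truncation length, the scan step, the tolerance, and the function names are assumptions made for illustration; the truncated series is adequate only for moderate arguments such as those used here.

```python
import math

def bessel_j0(x, terms=40):
    """J_0(x) from the power series sum_{n>=0} (-1)^n (x/2)^(2n) / (n!)^2."""
    total = 0.0
    term = 1.0  # the n = 0 term
    for n in range(terms):
        total += term
        # Ratio of consecutive terms is -(x/2)^2 / (n+1)^2.
        term *= -(x * x / 4.0) / ((n + 1) ** 2)
    return total

def bisect_zero(h, lo, hi, tol=1e-12):
    """Bisection on [lo, hi], assuming h changes sign on that interval."""
    f_lo = h(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = h(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def first_zeros(count, step=0.1):
    """Scan the positive axis for sign changes of J_0 and refine each by bisection."""
    zeros = []
    x = step
    while len(zeros) < count:
        if bessel_j0(x) * bessel_j0(x + step) < 0.0:
            zeros.append(bisect_zero(bessel_j0, x, x + step))
        x += step
    return zeros

k = first_zeros(3)                           # approximations to k_1, k_2, k_3
periods = [2.0 * math.pi / kn for kn in k]   # period in t of each mode cos(k_n t)
```

Unlike the string, whose frequencies $n\pi/l$ are integer multiples of one another, the ratios $k_2/k_1$ and $k_3/k_1$ computed this way are not integers, so the drum modes $\cos k_nt$ share no common period.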


3. Sturm-Liouville Theory 


The name “Sturm—Liouville theory” refers to the analysis of certain kinds of 
“eigenvalue” problems for linear ordinary differential equations, particularly 
equations of the second order. In this section we shall concentrate on one theorem 
of this kind, which was stated explicitly in Section 2 and was used as a tool for 
verifying that the method of separation of variables succeeded, for some examples, 
in solving a boundary-value problem for one of the standard partial differential 
equations. Before taking up this one theorem, however, let us make some general 
remarks about the setting, about “eigenvalues” and “eigenfunctions,” and about 
“self-adjointness.” 

Fix attention on an interval $[a, b]$ and on second-order differential operators
on this interval of the form $L = P(t)D^2 + Q(t)D + R(t)1$ with $D = d/dt$, so
that

\[ L(u) = P(t)u'' + Q(t)u' + R(t)u. \]

We shall assume that the coefficient functions $P$, $Q$, and $R$ are real-valued; then
$L(\bar u) = \overline{L(u)}$. As was mentioned in Section 2, the behavior of all functions in
question at the endpoints will be relevant to us: we say that a continuous function
$f : [a,b] \to \mathbb{C}$ with a derivative on $(a, b)$ has a continuous derivative at one or
both endpoints if $f'$ has a finite limit at the endpoint in question; it is equivalent
to say that $f$ extends to a larger set so as to be differentiable in an open interval
about the endpoint and to have its derivative be continuous at the endpoint.

An eigenvalue of the differential operator $L$ is a complex number $c$ such
that $L(u) = cu$ for some nonzero function $u$. Such a function $u$ is called an
eigenfunction. In practice we often have a particular nonvanishing function $r$
and look for $c$ such that $L(u) = cru$ for a nonzero $u$. In this case, $c$ is an eigenvalue
of $r^{-1}L$.

We introduce the inner-product space of complex-valued functions with two
continuous derivatives on $[a, b]$ and with $(u, v) = \int_a^b u(t)\overline{v(t)}\,dt$. Computation
using integration by parts and assuming suitable differentiability of the coefficients gives

\[
\begin{aligned}
(L(u), v) &= \int_a^b (Pu'' + Qu' + Ru)\,\bar v\,dt \\
&= \int_a^b \bigl((u'')(P\bar v) + (u')(Q\bar v) + (u)(R\bar v)\bigr)\,dt \\
&= \bigl[(u')(P\bar v) + (u)(Q\bar v)\bigr]_a^b - \int_a^b \bigl((u')(P\bar v)' + (u)(Q\bar v)' - (u)(R\bar v)\bigr)\,dt \\
&= \bigl[(u')(P\bar v) + (u)(Q\bar v) - (u)(P\bar v)'\bigr]_a^b + \int_a^b \bigl((u)(P\bar v)'' - (u)(Q\bar v)' + (u)(R\bar v)\bigr)\,dt \\
&= (u, L^*(v)) + \bigl[(u')(P\bar v) + (u)(Q\bar v) - (u)(P\bar v)'\bigr]_a^b,
\end{aligned}
\]

where $L^*(v) = Pv'' + (2P' - Q)v' + (P'' - Q' + R)v$. The above computation
shows that $(L(u), v) = (u, L^*(v))$ if the integrated terms are ignored; this
property is the abstract defining property of $L^*$. The differential operator $L^*$
is called the formal adjoint of $L$. We shall be interested only in the situation
in which $L^* = L$, which we readily see happens if and only if $P' = Q$; when
$L^* = L$, we say that $L$ is formally self adjoint. If $L$ is formally self adjoint,
then substitution of $Q = P'$ shows that the above identity reduces to

\[ (L(u), v) - (u, L(v)) = \bigl[(P)(u'\bar v - u\bar v')\bigr]_a^b, \]

which is known as Green’s formula.
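
Green’s formula can be checked numerically on a concrete operator. The Python sketch below takes the assumed sample data $P(t) = 1 + t^2$, $Q = P'$, $R(t) = -t$ on $[0, 1]$ and the real-valued test functions $u = \sin t$, $v = \cos 2t$, and compares the two sides of the identity by Simpson quadrature. These choices and every name in the code are illustrative assumptions only.

```python
import math

def simpson(h, a, b, n=2000):
    """Composite Simpson approximation of the integral of h over [a, b] (n even)."""
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3

# Assumed sample coefficients; Q = P' makes L formally self adjoint.
P = lambda t: 1.0 + t * t
Q = lambda t: 2.0 * t
R = lambda t: -t

# Assumed real-valued test functions together with their first two derivatives.
u, du, d2u = math.sin, math.cos, (lambda t: -math.sin(t))
v = lambda t: math.cos(2 * t)
dv = lambda t: -2 * math.sin(2 * t)
d2v = lambda t: -4 * math.cos(2 * t)

def apply_L(w, dw, d2w):
    """Return the function t -> P w'' + Q w' + R w."""
    return lambda t: P(t) * d2w(t) + Q(t) * dw(t) + R(t) * w(t)

Lu, Lv = apply_L(u, du, d2u), apply_L(v, dv, d2v)
a, b = 0.0, 1.0

# Left side (L(u), v) - (u, L(v)); conjugates are omitted because everything is real.
lhs = simpson(lambda t: Lu(t) * v(t) - u(t) * Lv(t), a, b)

# Right side: the boundary term P (u' v - u v') evaluated between a and b.
boundary = lambda t: P(t) * (du(t) * v(t) - u(t) * dv(t))
rhs = boundary(b) - boundary(a)
```

The two quantities agree up to quadrature error even though neither test function satisfies any boundary condition; the formula is an identity for all sufficiently smooth $u$ and $v$.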

Even when $L$ as above is not formally self adjoint, it can be multiplied by a
nonvanishing function, specifically $\exp\int^t [(Q(s) - P'(s))/P(s)]\,ds$, to become
formally self adjoint. Thus formal self-adjointness by itself is no restriction on
our second-order differential operator.

In the formally self-adjoint case, one often rewrites $P(t)D^2 + P'(t)D$ as
$D(P(t)D)$. With this understanding, let us rewrite our operator as

\[ L(u) = (p(t)u')' - q(t)u \]

and assume that $p$, $p'$, and $q$ are continuous on $[a, b]$ and that $p(t) > 0$ for
$a \le t \le b$. We associate a Sturm-Liouville eigenvalue problem called (SL)
to the set of data consisting of $L$, an everywhere-positive function $r$ with two
continuous derivatives on $[a, b]$, and real numbers $c_1, c_2, d_1, d_2$ such that $c_1$ and
$c_2$ are not both 0 and $d_1$ and $d_2$ are not both 0. This is the problem of analyzing
simultaneous solutions of

\[ (p(t)u')' - q(t)u + \lambda r(t)u = 0, \tag{SL1} \]
\[ c_1u(a) + c_2u'(a) = 0 \quad\text{and}\quad d_1u(b) + d_2u'(b) = 0, \tag{SL2} \]

for all values of $\lambda$.
Each condition (SL1) and (SL2) depends linearly on $u$ and $u'$ if $\lambda$ is fixed,
and thus the space of solutions of (SL) for fixed $\lambda$ is a vector space. We know⁶
that the vector space of solutions of (SL1) alone is 2-dimensional; let $u_1$ and $u_2$
form a basis of this vector space. The Wronskian matrix is $\begin{pmatrix} u_1 & u_2 \\ u_1' & u_2' \end{pmatrix}$, and the
determinant of this matrix, namely

\[ u_1(t)u_2'(t) - u_1'(t)u_2(t), \]

is nowhere 0. If $u_1$ and $u_2$ were both to satisfy the condition $c_1u(a) + c_2u'(a) = 0$
with $c_1$ and $c_2$ not both 0, then $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ would be a nontrivial solution of the matrix
equation

\[ \begin{pmatrix} u_1(a) & u_1'(a) \\ u_2(a) & u_2'(a) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]

and we would obtain the contradictory conclusion that the Wronskian matrix at $a$
is singular. We conclude that the space of solutions of (SL) for fixed $\lambda$ is at most
1-dimensional.

Let $(\varphi_1, \varphi_2)_r = \int_a^b \varphi_1(t)\overline{\varphi_2(t)}\,r(t)\,dt$ for any continuous functions $\varphi_1$ and $\varphi_2$
on $[a, b]$, and let $\|\varphi_1\|_r = ((\varphi_1, \varphi_1)_r)^{1/2}$. The unsubscripted expressions $(\varphi_1, \varphi_2)$
and $\|\varphi_1\|$ will refer to $(\varphi_1, \varphi_2)_r$ and $\|\varphi_1\|_r$ with $r = 1$. Then we can restate
Theorem 1.3 as follows.


Theorem 1.3′ (Sturm’s Theorem). The system (SL) has a nonzero solution
for a countably infinite set of values of $\lambda$. If $E$ denotes this set of values, then
the members $\lambda$ of $E$ are all real, they have no limit point in $\mathbb{R}$, and the space of
solutions of (SL) is 1-dimensional for each such $\lambda$. The set $E$ is bounded below
if $c_1c_2 \le 0$ and $d_1d_2 \ge 0$, and $E$ is bounded below by 0 if these conditions and
the condition $q \ge 0$ are all satisfied. In any case, enumerate $E$ in any fashion as
$\lambda_1, \lambda_2, \dots$, let $u = \varphi_n$ be a nonzero solution of (SL) when $\lambda = \lambda_n$, and normalize
$\varphi_n$ so that $\|\varphi_n\|_r = 1$. Then $(\varphi_n, \varphi_m)_r = 0$ for $m \ne n$, and the functions $\varphi_n$
satisfy the following completeness conditions:

(a) any $u$ having two continuous derivatives on $[a, b]$ and satisfying (SL2)
has the property that the series $\sum_{n=1}^{\infty} (u, \varphi_n)_r\,\varphi_n(t)$ converges absolutely
uniformly to $u(t)$ on $[a, b]$,

(b) the only continuous $\varphi$ on $[a, b]$ with $(\varphi, \varphi_n)_r = 0$ for all $n$ is $\varphi = 0$,

(c) any continuous $\varphi$ on $[a, b]$ satisfies $\|\varphi\|_r^2 = \sum_{n=1}^{\infty} |(\varphi, \varphi_n)_r|^2$.


⁶From Theorem 4.6 of Basic, for example.
REMARKS. In this section we shall reduce the proof of everything but (b)
and (c) to the Hilbert-Schmidt Theorem, which will be proved in Chapter II.
Conclusions (b) and (c) follow from (a) and some elementary facts about Hilbert
spaces, and we shall return to prove these two conclusions at the time of the
Hilbert-Schmidt Theorem in Chapter II.


PROOF EXCEPT FOR STEPS TO BE COMPLETED IN CHAPTER II. By way of
preliminaries, let $u$ and $v$ be nonzero functions on $[a, b]$ satisfying (SL2) and
having two continuous derivatives. Green’s formula gives

\[
\begin{aligned}
(L(u), v) - (u, L(v)) &= \bigl[(p)(u'\bar v - u\bar v')\bigr]_a^b \\
&= p(b)\bigl(u'(b)\bar v(b) - u(b)\bar v'(b)\bigr) - p(a)\bigl(u'(a)\bar v(a) - u(a)\bar v'(a)\bigr).
\end{aligned}
\]

Condition (SL2) says that

\[ c_1u(a) + c_2u'(a) = 0 \quad\text{and}\quad c_1v(a) + c_2v'(a) = 0. \]

Since $c_1$ and $c_2$ are real, these equations yield

\[ c_1u(a)\bar v(a) + c_2u'(a)\bar v(a) = 0 \quad\text{and}\quad c_1u(a)\bar v(a) + c_2u(a)\bar v'(a) = 0, \]

as well as

\[ c_1u(a)\bar v'(a) + c_2u'(a)\bar v'(a) = 0 \quad\text{and}\quad c_1u'(a)\bar v(a) + c_2u'(a)\bar v'(a) = 0. \]

Subtracting, for each of the above two displays, each second equation of a display
from the first equation of the display, we obtain

\[
\begin{aligned}
c_2\bigl(u'(a)\bar v(a) - u(a)\bar v'(a)\bigr) &= 0 \\
\text{and}\quad c_1\bigl(u(a)\bar v'(a) - u'(a)\bar v(a)\bigr) &= 0.
\end{aligned}
\]

Since $c_1$ and $c_2$ are not both 0, we conclude that $p(a)\bigl(u'(a)\bar v(a) - u(a)\bar v'(a)\bigr) = 0$.
A similar computation starting from

\[ d_1u(b) + d_2u'(b) = 0 \quad\text{and}\quad d_1v(b) + d_2v'(b) = 0 \]

shows that $p(b)\bigl(u'(b)\bar v(b) - u(b)\bar v'(b)\bigr) = 0$. Consequently

\[ (L(u), v) - (u, L(v)) = 0 \]

whenever $u$ and $v$ are functions on $[a, b]$ satisfying (SL2) and having two continuous derivatives.
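Now we can begin to establish the properties of the set $E$ of numbers $\lambda$ for which
(SL) has a nonzero solution. Suppose that $\varphi_\alpha$ and $\varphi_\beta$ satisfy $L(\varphi_\alpha) + \lambda_\alpha r\varphi_\alpha = 0$
and $L(\varphi_\beta) + \lambda_\beta r\varphi_\beta = 0$. By what we have just seen,

\[
\begin{aligned}
0 &= (L(\varphi_\alpha), \varphi_\beta) - (\varphi_\alpha, L(\varphi_\beta)) \\
&= -\int_a^b \lambda_\alpha r\varphi_\alpha\overline{\varphi_\beta}\,dt + \int_a^b \varphi_\alpha\overline{\lambda_\beta r\varphi_\beta}\,dt \\
&= (-\lambda_\alpha + \overline{\lambda_\beta})\int_a^b \varphi_\alpha\overline{\varphi_\beta}\,r\,dt = (-\lambda_\alpha + \overline{\lambda_\beta})(\varphi_\alpha, \varphi_\beta)_r.
\end{aligned}
\]

Taking $\varphi_\alpha = \varphi_\beta$ in this computation shows that $\lambda_\alpha = \overline{\lambda_\alpha}$; hence $\lambda_\alpha$ is real. With
$\lambda_\alpha$ and $\lambda_\beta$ real and unequal, this computation shows that $(\varphi_\alpha, \varphi_\beta)_r = 0$. Thus
the members of $E$ are real, and the corresponding $\varphi$’s are orthogonal. We have
seen that the space of solutions of (SL) corresponding to any
member of $E$ is 1-dimensional.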

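We shall prove that $E$ is at most countably infinite. Let $c = \bigl(\int_a^b r(t)\,dt\bigr)^{1/2}$.
Any continuous $\varphi$ on $[a, b]$ satisfies

\[ \|\varphi\|_r = \Bigl(\int_a^b |\varphi(t)|^2 r(t)\,dt\Bigr)^{1/2} \le \Bigl(\sup_{a\le t\le b}|\varphi(t)|\Bigr)\Bigl(\int_a^b r(t)\,dt\Bigr)^{1/2} = c\sup_{a\le t\le b}|\varphi(t)|. \]

Consider the open ball $B(k; \varphi)$ of radius $k$ and center $\varphi$ in the space $C([a, b])$ of
continuous functions on $[a, b]$; the metric is given by the supremum of the absolute
value of the difference of the functions. If $\psi$ is in this ball, then $\sup|\psi - \varphi| < k$,
$c\sup|\psi - \varphi| < ck$, and $\|\psi - \varphi\|_r < ck$. Choose $k$ with $ck = \tfrac12$. Suppose that $\varphi_\alpha$
and $\varphi_\beta$ correspond as above to unequal $\lambda_\alpha$ and $\lambda_\beta$ and that $\varphi_\alpha$ and $\varphi_\beta$ have been
normalized so that $\|\varphi_\alpha\|_r = \|\varphi_\beta\|_r = 1$. If $\psi$ is in $B(k; \varphi_\alpha) \cap B(k; \varphi_\beta)$, then
$\|\psi - \varphi_\alpha\|_r < \tfrac12$ and $\|\psi - \varphi_\beta\|_r < \tfrac12$. The triangle inequality gives $\|\varphi_\alpha - \varphi_\beta\|_r < 1$,
whereas the orthogonality implies that

\[
\begin{aligned}
\|\varphi_\alpha - \varphi_\beta\|_r^2 &= (\varphi_\alpha - \varphi_\beta, \varphi_\alpha - \varphi_\beta)_r \\
&= (\varphi_\alpha, \varphi_\alpha)_r - (\varphi_\alpha, \varphi_\beta)_r - (\varphi_\beta, \varphi_\alpha)_r + (\varphi_\beta, \varphi_\beta)_r = 1 + 1 = 2.
\end{aligned}
\]

The existence of $\psi$ thus leads us to a contradiction, and we conclude that $B(k; \varphi_\alpha)$
and $B(k; \varphi_\beta)$ are disjoint. Since $[a, b]$ is a compact metric space, $C([a, b])$ is separable
as a metric space,⁷ and hence so is the metric subspace $S = \bigcup_\alpha B(k; \varphi_\alpha)$.
The collection of all $B(k; \varphi_\alpha)$ is an open cover of $S$, and the separability gives us
a countable subcover. Since the sets $B(k; \varphi_\alpha)$ are disjoint, we conclude that the
set of all $\varphi_\alpha$ is countable. Hence $E$ is at most countably infinite.

⁷By Corollary 2.59 of Basic.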

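The next step is to bound $E$ below under additional hypotheses as in the
statement of the theorem. Let $\lambda$ be in $E$, and let $\varphi$ be a nonzero solution of (SL)
corresponding to $\lambda$ and normalized so that $\|\varphi\|_r = 1$. Multiplying (SL1) by $\bar\varphi$
and integrating, we have

\[
\begin{aligned}
\lambda = \lambda\int_a^b |\varphi|^2 r\,dt &= -\int_a^b (p\varphi')'\bar\varphi\,dt + \int_a^b q|\varphi|^2\,dt \\
&= -\bigl[p\varphi'\bar\varphi\bigr]_a^b + \int_a^b p|\varphi'|^2\,dt + \int_a^b q|\varphi|^2\,dt \\
&\ge -p(b)\varphi'(b)\bar\varphi(b) + p(a)\varphi'(a)\bar\varphi(a) + \int_a^b (|\varphi|^2 r)(r^{-1}q)\,dt \\
&\ge -p(b)\varphi'(b)\bar\varphi(b) + p(a)\varphi'(a)\bar\varphi(a) + \inf_{a\le t\le b}\{r(t)^{-1}q(t)\}.
\end{aligned}
\]

Let us show under the hypotheses $c_1c_2 \le 0$ and $d_1d_2 \ge 0$ that $\varphi'(a)\bar\varphi(a) \ge 0$
and $\varphi'(b)\bar\varphi(b) \le 0$, and then the asserted lower bounds will follow. Condition
(SL2) gives us $c_1\varphi(a) + c_2\varphi'(a) = 0$. If $c_1 = 0$ or $c_2 = 0$, then $\varphi'(a) = 0$
or $\varphi(a) = 0$, and hence $\varphi'(a)\bar\varphi(a) \ge 0$. If $c_1c_2 \ne 0$, then $c_1c_2 < 0$. The
identity $c_1\varphi(a) + c_2\varphi'(a) = 0$ implies that $c_1^2|\varphi(a)|^2 + c_1c_2\varphi'(a)\bar\varphi(a) = 0$ and
hence $-c_1c_2\varphi'(a)\bar\varphi(a) = c_1^2|\varphi(a)|^2 \ge 0$. Because of the condition $c_1c_2 < 0$,
we conclude that $\varphi'(a)\bar\varphi(a) \ge 0$. A similar argument using $d_1d_2 \ge 0$ and
$d_1\varphi(b) + d_2\varphi'(b) = 0$ shows that $\varphi'(b)\bar\varphi(b) \le 0$. This completes the verification
of the lower bounds for $\lambda$.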

We have therefore established all the results in the theorem that are to be proved 
at this time except for 


(i) the existence of a countably infinite set of $\lambda$ for which (SL) has a nonzero
solution,
(ii) the fact that E has no limit point in R, 
(iii) the assertion (a) about completeness. 


Before carrying out these steps, we may need to adjust $L$ slightly. We are studying
functions $u$ satisfying $L(u) + \lambda ru = 0$ and (SL2), and we have established that
the set $E$ of $\lambda$ for which there is a nonzero solution is at most countably infinite.
Choose a member $\lambda_0$ of the complementary set $E^c$ and rewrite the differential
equation as $M(u) + \nu ru = 0$, where $M(u) = L(u) + \lambda_0ru$ and $\nu = \lambda - \lambda_0$.
Then $M$ has properties similar to those of $L$, and it has the further property that 0
is not a value of $\nu$ for which $M(u) + \nu ru = 0$ and (SL2) together have a nonzero
solution. It would be enough to prove (i), (ii), and (iii) for $M(u) + \nu ru = 0$ and
(SL2). Adjusting notation, we may assume from the outset that 0 is not in $E$.
The next step is to prove the existence of a continuous real-valued function
$G_1(t, s)$ on $[a, b] \times [a, b]$ such that $G_1(t, s) = G_1(s, t)$, such that the operator
$T_1$ given by

\[ T_1f(t) = \int_a^b G_1(t, s)f(s)\,ds \]

carries the space $C[a, b]$ of continuous functions $f$ on $[a, b]$ one-one onto the
space $D[a, b]$ of functions $u$ on $[a, b]$ satisfying (SL2) and having two continuous
derivatives on $[a, b]$, and such that $L : D[a, b] \to C[a, b]$ is a two-sided inverse
function to $T_1$. The existence will be proved by an explicit construction that will
be carried out as a lemma at the end of this section. The function $G_1(t, s)$ is called
a Green’s function for the operator $L$ subject to the conditions (SL2). Assuming
that a Green’s function indeed exists, we next apply the Hilbert-Schmidt Theorem
of Chapter II in the following form:


SPECIAL CASE OF HILBERT-SCHMIDT THEOREM. Let $G(t,s)$ be a
continuous complex-valued function on $[a,b] \times [a,b]$ such that
$G(t, s) = \overline{G(s, t)}$, and define

\[ Tf(t) = \int_a^b G(t, s)f(s)\,ds \]

from the space $C[a, b]$ of continuous functions on $[a, b]$ to itself.
Define an inner product $(f, g) = \int_a^b f(t)\overline{g(t)}\,dt$ and its corresponding
norm $\|\cdot\|$ on $C[a, b]$. For each complex $\mu \ne 0$, define

\[ V_\mu = \{\, f : [a,b] \to \mathbb{C} \mid f \text{ is continuous and } T(f) = \mu f \,\}. \]

Then each $V_\mu$ is finite dimensional, the space $V_\mu$ is nonzero
for only countably many $\mu$, the $\mu$’s with $V_\mu \ne 0$ are all real, and
for any $\epsilon > 0$, there are only finitely many $\mu$ with $V_\mu \ne 0$
and $|\mu| \ge \epsilon$. The spaces $V_\mu$ are mutually orthogonal with respect
to the inner product $(f, g)$, and the continuous functions orthogonal
to all $V_\mu$ are the continuous functions $h$ with $T(h) = 0$. Let $v_1, v_2, \dots$
be an enumeration of the union of orthogonal bases of the spaces $V_\mu$
with $\|v_j\| = 1$ for all $j$. Then for any continuous $f$ on $[a, b]$,

\[ T(f)(t) = \sum_{n=1}^{\infty} (T(f), v_n)\,v_n(t), \]

the series on the right side being absolutely uniformly convergent.
The theorem is applied not to our Green’s function $G_1$ and the operator $T_1$ as
above but to

\[ G(t,s) = r(t)^{1/2}G_1(t, s)r(s)^{1/2} \]

\[ \text{and}\qquad Tf(t) = \int_a^b G(t, s)f(s)\,ds = r(t)^{1/2}\,T_1(r^{1/2}f)(t). \]

If $T(f) = \mu f$ for a real number $\mu \ne 0$, then we have $T_1(r^{1/2}f) = \mu r^{-1/2}f$.
Application of $L$ gives $r^{1/2}f = \mu L(r^{-1/2}f)$. If we put $u = r^{-1/2}f$, then
we obtain $\mu L(u) = r^{1/2}f = r(r^{-1/2}f) = ru$. Hence $L(u) + \lambda ru = 0$ for
$\lambda = -\mu^{-1}$. Also, the equation $u = r^{-1/2}f = \mu^{-1}T_1(r^{1/2}f)$ exhibits $u$ as in
the image of $T_1$ and shows that $u$ satisfies (SL2). Conversely if $L(u) + \lambda ru = 0$
and $u$ satisfies (SL2), recall that we arranged that 0 is not in $E$, so that $\lambda$ has a
reciprocal. Define $f = r^{1/2}u$. Application of $T_1$ to $L(u) + \lambda ru = 0$ gives $0 =
u + \lambda T_1(ru) = r^{-1/2}f + \lambda T_1(r^{1/2}f)$. Then $T(f) = r^{1/2}T_1(r^{1/2}f) = -\lambda^{-1}f$.
We conclude that the correspondence $f = r^{1/2}u$ exactly identifies the vector
subspace of functions $u$ in $D[a, b]$ satisfying $L(u) + \lambda ru = 0$ with the vector
subspace of functions $f$ in $C[a, b]$ satisfying $T(f) = -\lambda^{-1}f$.

The statement of Sturm’s Theorem gives us an enumeration $\lambda_1, \lambda_2, \dots$ of $E$.
We know for each $\lambda = \lambda_n$ that the space of functions $u$ solving (SL) for $\lambda = \lambda_n$
in $E$ is 1-dimensional, and the statement of Sturm’s Theorem has selected for
us a function $u = \varphi_n$ solving (SL) such that $\|\varphi_n\|_r = 1$. Define $v_n = r^{1/2}\varphi_n$
and $\mu_n = -\lambda_n^{-1}$, so that $T(v_n) = \mu_nv_n$ and $\|v_n\| = \|\varphi_n\|_r = 1$. Because of
the correspondence of $\mu$’s and $\lambda$’s, the $v_n$ may be taken as the complete list of
vectors specified in the Hilbert-Schmidt Theorem. Since the $\varphi_n$’s are orthogonal
for $(\,\cdot\,,\,\cdot\,)_r$, the $v_n$’s are orthogonal for $(\,\cdot\,,\,\cdot\,)$.

The operator $T_1$ has 0 kernel on $C[a, b]$, being invertible, and the formula
for $T$ in terms of $T_1$ shows therefore that $T$ has 0 kernel. Thus the sequence
$\mu_1, \mu_2, \dots$ is infinite, and the Hilbert-Schmidt Theorem shows that it tends to
0. The corresponding sequence $\lambda_1, \lambda_2, \dots$ of negative reciprocals is then infinite
and has no finite limit point. This proves results (i) and (ii) announced above.

Let $u$ have two continuous derivatives on $[a, b]$ and satisfy (SL2). Then $u$ is in
the image of $T_1$. Write $u = T_1(f)$ with $f$ continuous, and put $g = r^{-1/2}f$. Then
$u = T_1(f) = r^{-1/2}T(r^{-1/2}f) = r^{-1/2}T(g)$ and $(u, \varphi_n)_r = (T(g), v_n)$. Hence

\[
\begin{aligned}
r(t)^{1/2}u(t) &= T(g)(t) \\
\text{and}\quad r(t)^{1/2}(u, \varphi_n)_r\,\varphi_n(t) &= (T(g), v_n)\,v_n(t).
\end{aligned}
\]

The Hilbert-Schmidt Theorem tells us that the series $\sum_{n=1}^{\infty} (T(g), v_n)\,v_n(t)$
converges absolutely uniformly to $T(g)(t)$. Because $r(t)^{1/2}$ is bounded above
and below by positive constants, it follows that the series $\sum_{n=1}^{\infty} (u, \varphi_n)_r\,\varphi_n(t)$
converges absolutely uniformly to $u(t)$. This proves result (iii), i.e., the completeness
assertion (a) in the statement of Sturm’s Theorem, and we are done
for now except for the proof of the existence of the Green’s function $G_1$.


Lemma 1.4. Under the assumption that there is no nonzero solution of (SL) for
$\lambda = 0$, there exists a continuous real-valued function $G_1(t, s)$ on $[a, b] \times [a, b]$
such that $G_1(t, s) = G_1(s, t)$, such that the operator $T_1$ given by

\[ T_1f(t) = \int_a^b G_1(t, s)f(s)\,ds \]

carries the space $C[a, b]$ of continuous functions $f$ on $[a, b]$ one-one onto the
space $D[a, b]$ of functions $u$ on $[a, b]$ satisfying (SL2) and having two continuous
derivatives on $[a, b]$, and such that $L : D[a, b] \to C[a, b]$ is a two-sided inverse
function to $T_1$.


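PROOF. Since $L(u) = pu'' + p'u' - qu$, a solution of $L(u) = 0$ has $u'' =
-p^{-1}p'u' + p^{-1}qu$. Fix a point $c$ in $[a, b]$. Let $\varphi_1(t)$ and $\varphi_2(t)$ be the unique
solutions of $L(u) = 0$ on $[a, b]$ satisfying

\[ \varphi_1(c) = 1 \text{ and } \varphi_1'(c) = 0, \qquad \varphi_2(c) = 0 \text{ and } \varphi_2'(c) = 1. \]

Since the complex conjugate of $\varphi_1$ or $\varphi_2$ satisfies the same conditions, we must
have $\varphi_1 = \overline{\varphi_1}$ and $\varphi_2 = \overline{\varphi_2}$. Hence $\varphi_1$ and $\varphi_2$ are real-valued. The associated
Wronskian matrix is

\[ W(\varphi_1, \varphi_2)(t) = \begin{pmatrix} \varphi_1(t) & \varphi_2(t) \\ \varphi_1'(t) & \varphi_2'(t) \end{pmatrix}, \]

and its determinant is

\[ \det W(\varphi_1, \varphi_2)(t) = \varphi_1(t)\varphi_2'(t) - \varphi_1'(t)\varphi_2(t). \]

Then $\det W(\varphi_1, \varphi_2)(c) = 1$ and $\det W(\varphi_1, \varphi_2)(t)$ satisfies the first-order linear
homogeneous differential equation

\[
\begin{aligned}
(\det W(\varphi_1, \varphi_2))' &= \varphi_1\varphi_2'' - \varphi_1''\varphi_2 \\
&= \varphi_1(-p^{-1}p'\varphi_2' + p^{-1}q\varphi_2) - \varphi_2(-p^{-1}p'\varphi_1' + p^{-1}q\varphi_1) \\
&= -p^{-1}p'(\varphi_1\varphi_2' - \varphi_1'\varphi_2) \\
&= -p^{-1}p'\det W(\varphi_1, \varphi_2).
\end{aligned}
\]

Therefore

\[
\begin{aligned}
\det W(\varphi_1, \varphi_2)(t) &= \exp\Bigl(-\int_c^t p'(s)/p(s)\,ds\Bigr) = \exp\bigl(-\log p(t) + \log p(c)\bigr) \\
&= \exp\bigl(\log(p(c)/p(t))\bigr) = p(c)/p(t).
\end{aligned}
\]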


For $f$ continuous, consider the solutions of the equation $L(u) = f$. A specific
solution is given by variation of parameters, as stated in Theorem 4.9 of Basic.
To use the formula in that theorem, we need $L$ to have leading coefficient 1. For
that purpose, we rewrite $L(u) = f$ as $u'' + p^{-1}p'u' - p^{-1}qu = p^{-1}f$. The
theorem shows that one solution $u^*(t)$ is given by the first entry of

\[ W(\varphi_1, \varphi_2)(t)\int_a^t W(\varphi_1, \varphi_2)(s)^{-1}\begin{pmatrix} 0 \\ p(s)^{-1}f(s) \end{pmatrix}ds. \]

Since $W(\varphi_1, \varphi_2)(s)^{-1} = (\det W(\varphi_1, \varphi_2)(s))^{-1}\begin{pmatrix} \varphi_2'(s) & -\varphi_2(s) \\ -\varphi_1'(s) & \varphi_1(s) \end{pmatrix}$, the result is

\[
\begin{aligned}
u^*(t) &= \int_a^t \frac{-\varphi_1(t)\varphi_2(s)p^{-1}(s)f(s) + \varphi_2(t)\varphi_1(s)p^{-1}(s)f(s)}{p(c)/p(s)}\,ds \\
&= p(c)^{-1}\int_a^t \bigl(-\varphi_1(t)\varphi_2(s) + \varphi_2(t)\varphi_1(s)\bigr)f(s)\,ds.
\end{aligned}
\]

Define

\[ G_0(t, s) = \begin{cases} p(c)^{-1}\bigl(-\varphi_1(t)\varphi_2(s) + \varphi_2(t)\varphi_1(s)\bigr) & \text{if } s \le t, \\ 0 & \text{if } s > t. \end{cases} \]

This function is continuous everywhere on $[a, b] \times [a, b]$, including where $s = t$,
and it has been constructed so that

\[ u^*(t) = \int_a^t G_0(t, s)f(s)\,ds = \int_a^b G_0(t, s)f(s)\,ds \]

is a solution of $u'' + p^{-1}p'u' - p^{-1}qu = p^{-1}f$, i.e., of $L(u) = f$. In particular,
the form of the equation shows that $u^*$ has two continuous derivatives on $[a, b]$.
Therefore the operator

\[ T_0(f)(t) = \int_a^b G_0(t, s)f(s)\,ds \]

carries $C[a, b]$ into the space of twice continuously differentiable functions on
$[a, b]$.
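The final step is to adjust $G_0$ and $T_0$ so that the operator produces twice
continuously differentiable functions satisfying (SL2). Fix $f$ continuous, and
let $u^*(t) = \int_a^b G_0(t, s)f(s)\,ds$. By assumption the equation $L(u) = 0$ has no
nonzero solution that satisfies (SL2). Thus the function $\varphi(t) = x_1\varphi_1(t) + x_2\varphi_2(t)$
does not have both

\[ c_1\varphi(a) + c_2\varphi'(a) = 0 \quad\text{and}\quad d_1\varphi(b) + d_2\varphi'(b) = 0 \]

unless $x_1$ and $x_2$ are both 0. In other words the homogeneous system of equations

\[ \begin{pmatrix} c_1\varphi_1(a) + c_2\varphi_1'(a) & c_1\varphi_2(a) + c_2\varphi_2'(a) \\ d_1\varphi_1(b) + d_2\varphi_1'(b) & d_1\varphi_2(b) + d_2\varphi_2'(b) \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]

has only the trivial solution. Consequently the system given by

\[ \begin{pmatrix} c_1\varphi_1(a) + c_2\varphi_1'(a) & c_1\varphi_2(a) + c_2\varphi_2'(a) \\ d_1\varphi_1(b) + d_2\varphi_1'(b) & d_1\varphi_2(b) + d_2\varphi_2'(b) \end{pmatrix}\begin{pmatrix} k_1 \\ k_2 \end{pmatrix} = -\begin{pmatrix} c_1u^*(a) + c_2u^{*\prime}(a) \\ d_1u^*(b) + d_2u^{*\prime}(b) \end{pmatrix} \tag{$*$} \]

has a unique solution $\begin{pmatrix} k_1 \\ k_2 \end{pmatrix}$ for fixed $f$. We need to know how $k_1$ and $k_2$ depend
on $f$. From the form of $G_0$, we have

\[ u^*(t) = p(c)^{-1}\Bigl(-\varphi_1(t)\int_a^t \varphi_2(s)f(s)\,ds + \varphi_2(t)\int_a^t \varphi_1(s)f(s)\,ds\Bigr). \]

By inspection, two terms in the differentiation drop out and the derivative is

\[ u^{*\prime}(t) = p(c)^{-1}\Bigl(-\varphi_1'(t)\int_a^t \varphi_2(s)f(s)\,ds + \varphi_2'(t)\int_a^t \varphi_1(s)f(s)\,ds\Bigr). \]

Evaluation of these formulas at $a$ and $b$ gives

\[
\begin{aligned}
u^*(a) &= u^{*\prime}(a) = 0, \\
u^*(b) &= p(c)^{-1}\Bigl(-\varphi_1(b)\int_a^b \varphi_2(s)f(s)\,ds + \varphi_2(b)\int_a^b \varphi_1(s)f(s)\,ds\Bigr), \\
u^{*\prime}(b) &= p(c)^{-1}\Bigl(-\varphi_1'(b)\int_a^b \varphi_2(s)f(s)\,ds + \varphi_2'(b)\int_a^b \varphi_1(s)f(s)\,ds\Bigr).
\end{aligned}
\]

Thus the right side of the equation ($*$) that defines $k_1$ and $k_2$ is of the form

\[ -\begin{pmatrix} c_1u^*(a) + c_2u^{*\prime}(a) \\ d_1u^*(b) + d_2u^{*\prime}(b) \end{pmatrix} = \begin{pmatrix} 0 \\ \int_a^b \bigl(e_1\varphi_1(s) + e_2\varphi_2(s)\bigr)f(s)\,ds \end{pmatrix}, \]

where $e_1$ and $e_2$ are real constants independent of $f$. Hence $k_1$ and $k_2$ are of the
form

\[ \begin{pmatrix} k_1 \\ k_2 \end{pmatrix} = \begin{pmatrix} \int_a^b \bigl(\alpha\varphi_1(s) + \beta\varphi_2(s)\bigr)f(s)\,ds \\ \int_a^b \bigl(\gamma\varphi_1(s) + \delta\varphi_2(s)\bigr)f(s)\,ds \end{pmatrix}, \]

where $\alpha, \beta, \gamma, \delta$ are real constants independent of $f$. The fact that $\begin{pmatrix} k_1 \\ k_2 \end{pmatrix}$ solves
the system ($*$) means that the function $v(t)$ given by

\[ v(t) = u^*(t) + \varphi_1(t)\int_a^b \bigl(\alpha\varphi_1(s) + \beta\varphi_2(s)\bigr)f(s)\,ds + \varphi_2(t)\int_a^b \bigl(\gamma\varphi_1(s) + \delta\varphi_2(s)\bigr)f(s)\,ds \]

satisfies $c_1v(a) + c_2v'(a) = 0$ and $d_1v(b) + d_2v'(b) = 0$. Put

\[ \begin{pmatrix} K_1(s) \\ K_2(s) \end{pmatrix} = \begin{pmatrix} \alpha\varphi_1(s) + \beta\varphi_2(s) \\ \gamma\varphi_1(s) + \delta\varphi_2(s) \end{pmatrix}. \]

We can summarize the above computation by saying that the real-valued continuous
function

\[ G_1(t, s) = G_0(t, s) + K_1(s)\varphi_1(t) + K_2(s)\varphi_2(t) \]

has, for every continuous $f$, the property that $v(t) = \int_a^b G_1(t, s)f(s)\,ds$ satisfies
$L(v) = f$ and the condition (SL2).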

Define $T_1(f)(t) = \int_a^b G_1(t, s)f(s)\,ds$. We have seen that $T_1$ carries $C[a, b]$
into $D[a, b]$ and that $L(T_1(f)) = f$. Now suppose that $u$ is in $D[a, b]$. Since $L(u)$
is continuous, $T_1(L(u))$ is in $D[a, b]$ and has $L(T_1(L(u))) = L(u)$. Therefore
$T_1(L(u)) - u$ is in $D[a, b]$ and has $L(T_1(L(u)) - u) = 0$. We have assumed that
there is no nonzero solution of (SL) for $\lambda = 0$, and therefore $T_1(L(u)) = u$. Thus
$T_1$ and $L$ are two-sided inverses of one another.

Finally we are to prove that $G_1(t, s) = G_1(s, t)$. Let $f$ and $g$ be arbitrary real-valued
continuous functions on $[a, b]$, and put $u = T_1(f)$ and $v = T_1(g)$. We
know from Green’s formula and (SL2) that $(L(u), v) = (u, L(v))$. Substituting
the formulas $f = L(u)$ and $g = L(v)$ into this equality gives

\[
\begin{aligned}
\int_a^b\!\!\int_a^b G_1(t, s)f(t)g(s)\,ds\,dt &= \int_a^b f(t)v(t)\,dt = (L(u), v) \\
&= (u, L(v)) = \int_a^b u(s)g(s)\,ds = \int_a^b\!\!\int_a^b G_1(s, t)f(t)g(s)\,dt\,ds.
\end{aligned}
\]

By Fubini’s Theorem the identity

\[ \int_a^b\!\!\int_a^b \bigl(G_1(t, s) - G_1(s, t)\bigr)F(s, t)\,dt\,ds = 0 \]

holds when $F$ is one of the linear combinations of continuous functions $f(s)g(t)$.
We can extend this conclusion to general continuous $F$ by passing to the limit
and using uniform convergence because the Stone-Weierstrass Theorem shows
that real linear combinations of products $f(t)g(s)$ are uniformly dense in the
space of continuous real-valued functions on $[a, b] \times [a, b]$. Taking $F(s,t) =
G_1(t,s) - G_1(s,t)$, we see that $\int_a^b\!\int_a^b (G_1(t, s) - G_1(s, t))^2\,dt\,ds = 0$. Therefore
$G_1(t,s) - G_1(s,t) = 0$ and $G_1(t, s) = G_1(s, t)$. This completes the proof of
the lemma.
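
The construction in the lemma can be traced in a small case. The Python sketch below assumes $L(u) = u''$ on $[0, 1]$ (so $p = 1$, $q = 0$) with (SL2) given by $u(0) = u(1) = 0$, and takes the base point $c = 0$, so that $\varphi_1(t) = 1$ and $\varphi_2(t) = t$. Then $G_0(t, s) = t - s$ for $s \le t$ and $0$ otherwise, and imposing (SL2) gives $K_1 = 0$ and $K_2(s) = s - 1$. The quadrature rule and the function names are assumptions made for this illustration.

```python
import math

def simpson(h, a, b, n=2000):
    """Composite Simpson approximation of the integral of h over [a, b] (n even)."""
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3

def green_g0(t, s):
    """G_0 for L(u) = u'' with c = 0, phi_1(t) = 1, phi_2(t) = t, p(c) = 1."""
    return (t - s) if s <= t else 0.0

def green_g1(t, s):
    """G_1 = G_0 + K_1(s) phi_1(t) + K_2(s) phi_2(t) with K_1 = 0, K_2(s) = s - 1."""
    return green_g0(t, s) + (s - 1.0) * t

def apply_t1(f, t):
    """T_1 f(t) = integral_0^1 G_1(t, s) f(s) ds, split at s = t where G_1 has a kink."""
    integrand = lambda s: green_g1(t, s) * f(s)
    return simpson(integrand, 0.0, t) + simpson(integrand, t, 1.0)

# u'' = 1 with u(0) = u(1) = 0 has the solution t(t - 1)/2.
u_const = apply_t1(lambda s: 1.0, 0.3)

# u'' = sin(pi s) with u(0) = u(1) = 0 has the solution -sin(pi t)/pi^2.
u_sine = apply_t1(lambda s: math.sin(math.pi * s), 0.3)

# Largest violation of G_1(t, s) = G_1(s, t) on a coarse grid.
symmetric_gap = max(
    abs(green_g1(i / 10, j / 10) - green_g1(j / 10, i / 10))
    for i in range(11) for j in range(11)
)
```

In this case $G_1(t, s)$ equals $s(t - 1)$ for $s \le t$ and $t(s - 1)$ for $s \ge t$, which makes the symmetry asserted in the lemma visible by inspection.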


HISTORICAL REMARKS. Sturm’s groundbreaking paper appeared in 1836. In 
that paper he proved that the set E in Theorem 1.3’ is infinite by comparing the 
zeros of solutions of various equations, but he did not address the question of 
completeness. Liouville introduced integral equations in 1837. 


4. Problems 


1. Let $p_n$ be the $n$th-smallest positive real number $p$ such that $h\sin pl + p\cos pl = 0$,
as in Example 3 for the heat equation in Section 2. Here $h$ and $l$ are positive constants.
Prove directly that $\int_0^l \sin p_nx\,\sin p_mx\,dx = 0$ for $n \ne m$ by substituting
from the trigonometric identity $\sin a\sin b = -\tfrac12\bigl(\cos(a + b) - \cos(a - b)\bigr)$.


2. Multiplying the relevant differential operators by functions to make them formally
self adjoint, and applying Green’s formula from Section 3, prove the
following orthogonality relations:

(a) $\int_{-1}^{1} P_n(t)P_m(t)\,dt = 0$ if $P_n$ and $P_m$ are Legendre polynomials and $n \ne m$.
The $m$th Legendre polynomial $P_m$ is a certain nonzero polynomial solution
of the Legendre equation $(1 - t^2)P'' - 2tP' + m(m+1)P = 0$. It is unique
up to a scalar factor. These polynomials are applied in the second example
with the Laplace equation in Section 2.

(b) $\int_0^1 J_0(k_nr)J_0(k_mr)\,r\,dr = 0$ if $k_n$ and $k_m$ are distinct zeros of the Bessel function
$J_0$. The function $J_0$ is the power series solution $J_0(t) = \sum_{n=0}^{\infty} \frac{(-1)^nt^{2n}}{2^{2n}(n!)^2}$
of the Bessel equation of order 0, namely $t^2y'' + ty' + t^2y = 0$. It is applied
in the last example of Section 2.


3. In the proof of Lemma 1.4:
(a) Show directly by expanding out $u^*(t) = \int_a^b G_0(t, s)f(s)\,ds$ that $u^*$ satisfies
$L(u^*) = f$.
(b) Calculate $G_0(t, s)$ and $G_1(t, s)$ explicitly for the case that $L(u) = u'' + u$
when the conditions (SL2) are that $u(0) = 0$ and $u(\pi/2) = 0$.
4. This problem discusses the starting point for Sturm’s original theory. Suppose
that $p(t)$, $p'(t)$, $g_1(t)$, and $g_2(t)$ are real-valued and continuous on $[a, b]$ and
that $p(t) > 0$ and $g_2(t) > g_1(t)$ everywhere on $[a, b]$. Let $y_1(t)$ and $y_2(t)$ be
real-valued solutions of the respective equations

\[ (p(t)y')' + g_1(t)y = 0 \quad\text{and}\quad (p(t)y')' + g_2(t)y = 0. \]

Follow the steps below to show that if $t_1$ and $t_2$ are consecutive zeros of $y_1(t)$,
then $y_2(t)$ vanishes somewhere on $(t_1, t_2)$.
(a) Arguing by contradiction and assuming that $y_2(t)$ is nonvanishing on $(t_1, t_2)$,
normalize matters so that $y_1(t) > 0$ and $y_2(t) > 0$ on $(t_1, t_2)$. Multiply the
first equation by $y_2$, the second equation by $y_1$, subtract, and integrate over
$[t_1, t_2]$. Conclude from this computation that $\bigl[py_1'y_2 - py_1y_2'\bigr]_{t_1}^{t_2} > 0$.
(b) Taking the signs of $p$, $y_1$, $y_2$ and the behavior of the derivatives into account,
prove that $p(t)y_1'(t)y_2(t) - p(t)y_1(t)y_2'(t)$ is $\le 0$ at $t = t_2$ and is $\ge 0$ at
$t_1$, in contradiction to the conclusion of (a). Conclude that $y_2(t)$ must have
equaled 0 somewhere on $(t_1, t_2)$.
(c) Suppose in addition that $q(t)$ and $r(t)$ are continuous on $[a, b]$ and that
$r(t) > 0$ everywhere. Let $y_1(t)$ and $y_2(t)$ be real-valued solutions of the
respective equations

\[ (p(t)y')' - q(t)y + \lambda_1r(t)y = 0 \quad\text{and}\quad (p(t)y')' - q(t)y + \lambda_2r(t)y = 0, \]

where $\lambda_1$ and $\lambda_2$ are real with $\lambda_1 < \lambda_2$. Obtain as a corollary of (b) that
$y_2(t)$ vanishes somewhere on the interval between two consecutive zeros of
$y_1(t)$.
Problems 5-8 concern Schrödinger’s equation in one space dimension with a time-independent
potential $V(x)$. In suitable units the equation is

\[ -\frac{\partial^2\Psi(x, t)}{\partial x^2} + V(x)\Psi(x, t) = i\,\frac{\partial\Psi(x, t)}{\partial t}. \]

5. (a) Show that any solution of the form $\Psi(x,t) = \psi(x)\varphi(t)$ is such that
$\psi'' + (E - V(x))\psi = 0$ for some constant $E$.
(b) Compute what the function $\varphi(t)$ must be in (a).

6. Suppose that $V(x) = x^2$, so that $\psi'' + (E - x^2)\psi = 0$. Put $\psi(x) = e^{-x^2/2}H(x)$,
and show that

\[ H'' - 2xH' + (E - 1)H = 0. \]

This ordinary differential equation is called Hermite’s equation.

7. Solve the equation $H'' - 2xH' + 2nH = 0$ by power series. Show that there
is a nonzero polynomial solution if and only if $n$ is an integer $\ge 0$, and in this
case the polynomial is unique up to scalar multiplication and has degree $n$. For
a suitable normalization the polynomial is denoted by $H_n(x)$ and is called a
Hermite polynomial.

8. Guided by Problem 6, let $L$ be the formally self-adjoint operator

\[ L(\psi) = \psi'' - x^2\psi. \]

Using Green’s formula from Section 3 for this $L$ on the interval $[-N, N]$ and
letting $N$ tend to infinity, prove that

\[ \lim_{N\to\infty}\int_{-N}^{N} H_n(x)H_m(x)e^{-x^2}\,dx = 0 \quad\text{if } n \ne m. \]


CHAPTER II 


Compact Self-Adjoint Operators 


Abstract. This chapter proves a first version of the Spectral Theorem and shows how it applies to 
complete the analysis in Sturm’s Theorem of Section I.3. 

Section 1 introduces compact linear operators from a Hilbert space into itself and characterizes
them as the limits in the operator norm topology of the linear operators of finite rank. The adjoint
of a compact operator is compact.

Section 2 proves the Spectral Theorem for compact self-adjoint operators on a Hilbert space, 
showing that such operators have orthonormal bases of eigenvectors with eigenvalues tending to 0. 

Section 3 establishes two versions of the Hilbert-Schmidt Theorem concerning self-adjoint
integral operators with a square-integrable kernel. The abstract version gives an $L^2$ expansion of
the members of the image of the operator in terms of eigenfunctions, and the concrete version, valid
when the kernel is continuous and the space is compact metric, proves that the eigenfunctions are
continuous and the expansion in terms of eigenfunctions is uniformly convergent.

Section 4 introduces unitary operators on a Hilbert space, establishing the equivalence of three 
conditions that may be used to define them. 

Section 5 studies compact linear operators on an abstract Hilbert space, with special attention 
to two kinds—the Hilbert-Schmidt operators and the operators of trace class. All three sets of 
operators—compact, Hilbert-Schmidt, and trace-class—are ideals in the algebra of all bounded 
linear operators and are closed under the operation of adjoint. Trace-class implies Hilbert-Schmidt, 
which implies compact. The product of two Hilbert-Schmidt operators is of trace class. 


1. Compact Operators 


Let $H$ be a real or complex Hilbert space with inner product¹ $(\,\cdot\,,\,\cdot\,)$ and norm
$\|\cdot\|$. A bounded linear operator $L : H \to H$ is said to be compact if $L$
carries the closed unit ball of $H$ to a subset of $H$ that has compact closure, i.e., if
each bounded sequence $\{u_n\}$ in $H$ has the property that $\{L(u_n)\}$ has a convergent
subsequence.² The first three conclusions of the next proposition together give a
characterization of the compact operators on $H$.


¹This book follows the convention that inner products are linear in the first variable and conjugate
linear in the second variable.

²Some books use the words “completely continuous” in place of “compact” for this kind of
operator.

Proposition 2.1. Let $L : H \to H$ be a bounded linear operator on a Hilbert
space $H$. Then


(a) $L$ is compact if the image of $L$ is finite dimensional,

(b) $L$ is compact if $L$ is the limit, in the operator norm, of a sequence of
compact operators,

(c) $L$ compact implies that there exist bounded linear operators $L_n : H \to H$
such that $L = \lim L_n$ in the operator norm and the image of each $L_n$ is
finite dimensional,

(d) $L$ compact implies $L^*$ compact.


PROOF. For (a), let $M$ be the image of $L$. Being finite dimensional, $M$ is
closed and is hence a Hilbert space. Let $\{v_1, \dots, v_k\}$ be an orthonormal basis.
The linear mapping that carries each $v_j$ to the $j$th standard basis vector $e_j$ in the
space of column vectors is then a linear isometry of $M$ onto $\mathbb{R}^k$ or $\mathbb{C}^k$. In $\mathbb{R}^k$ and
$\mathbb{C}^k$, the closed ball about 0 of radius $\|L\|$ is compact, and hence the closed ball
about 0 of radius $\|L\|$ in $M$ is compact. The latter closed ball contains the image
of the closed unit ball of $H$ under $L$, and hence $L$ is compact.

For (b), let $B$ be the closed unit ball of $H$. Write $L = \lim L_n$ in the operator
norm, each $L_n$ being compact. Since the subsets of a complete metric space having
compact closure are exactly the totally bounded subsets, it is enough to prove that
$L(B)$ is totally bounded. Let $\epsilon > 0$ be given, and choose $n$ large enough so that
$\|L_n - L\| < \epsilon/2$. With $n$ fixed, $L_n(B)$ is totally bounded since $L_n(B)$ is assumed
to have compact closure. Thus we can find finitely many points $v_1, \dots, v_k$ such
that the open balls of radius $\epsilon/2$ about the $v_j$'s together cover $L_n(B)$. We shall
prove that the open balls of radius $\epsilon$ about the $v_j$'s together cover $L(B)$. In
fact, if $u$ is given with $\|u\| \le 1$, choose $j$ with $\|L_n(u) - v_j\| < \epsilon/2$. Then

$$\|L(u) - v_j\| \le \|L(u) - L_n(u)\| + \|L_n(u) - v_j\| < \|L_n - L\|\,\|u\| + \tfrac{\epsilon}{2} \le \tfrac{\epsilon}{2} + \tfrac{\epsilon}{2} = \epsilon,$$

as required.

For (c), we may assume that $H$ is infinite dimensional. Since $L$ is compact,
there exists a compact subset $K$ of $H$ containing the image of the closed unit ball.
As a compact metric space, $K$ is separable. Let $\{w_n\}$ be a countable dense set,
and let $M$ be the smallest closed vector subspace of $H$ containing all $w_n$. Since
the closure of $\{w_n\}$ contains $K$, $M$ contains $K$. The subspace $M$ is separable:
in fact, if the scalars are real, then the set of all rational linear combinations of
the $w_n$'s is a countable dense set; if the scalars are complex, then we obtain a
countable dense set by allowing the scalars to be of the form $a + bi$ with $a$ and $b$
rational.

Since $M$ is a closed vector subspace, it is a Hilbert space and has an orthonormal
basis $S$. The set $S$ must be countable since the open balls of radius 1/2 centered at
the members of $S$ are disjoint and would otherwise contradict the fact that every
topological subspace of a separable metric space is Lindelöf. Thus let us
list the members of $S$ as $v_1, v_2, \dots$. For each $n$, let $M_n$ be the (closed) linear
span of $\{v_1, \dots, v_n\}$, and let $E_n$ be the orthogonal projection on $M_n$. The linear
operator $E_n L$ is bounded, being a composition of bounded linear operators, and
its image is contained in the finite-dimensional space $M_n$. Hence it is enough
to show for each $\epsilon > 0$ that there is some $n$ with $\|(1 - E_n)L\| < \epsilon$. If this
condition were to fail, we could find some $\epsilon > 0$ such that $\|(1 - E_n)L\| \ge \epsilon$ for
every $n$. With $\epsilon$ fixed in this way, choose for each $n$ some vector $u_n$ of norm 1
such that $\|(1 - E_n)L(u_n)\| \ge \epsilon/2$. The sequence $\{L(u_n)\}$ lies in the compact set
$K$. Choose a convergent subsequence $\{L(u_{n_k})\}$, and let $v = \lim L(u_{n_k})$. For $n_k$
sufficiently large, we have $\|v - L(u_{n_k})\| < \epsilon/4$. In this case,

$$\|(1 - E_{n_k})v\| \ge \|(1 - E_{n_k})L(u_{n_k})\| - \|(1 - E_{n_k})(v - L(u_{n_k}))\| \ge \tfrac{\epsilon}{2} - \tfrac{\epsilon}{4} = \tfrac{\epsilon}{4}.$$

On the other hand, $v$ is in $M$, and $v$ is of the form $v = \sum_{j=1}^{\infty} (v, v_j)v_j$. In this
expression we have $E_n(v) = \sum_{j=1}^{n} (v, v_j)v_j$, and these partial sums converge to
$v$ in $H$. In short, $\lim_n E_n v = v$. Then $\|(1 - E_n)v\|$ tends to 0, and this contradicts
our estimate $\|(1 - E_{n_k})v\| \ge \epsilon/4$.

For (d), first suppose that the image of $L$ is finite dimensional, and choose an
orthonormal basis $\{u_1, \dots, u_n\}$ of the image. Then $L(u) = \sum_{j=1}^{n} (L(u), u_j)u_j =
\sum_{j=1}^{n} (u, L^*(u_j))u_j$. Taking the inner product with $v$ gives $(u, L^*(v)) =
(L(u), v) = \sum_{j=1}^{n} (u, L^*(u_j))(u_j, v)$. This equality shows that $L^*(v)$ and
$\sum_{j=1}^{n} (v, u_j)L^*(u_j)$ have the same inner product with every $u$. Thus they must
be equal, and we conclude that the image of $L^*$ is finite dimensional.

Now suppose that $L$ is any compact operator on $H$. Given $\epsilon > 0$, use (c)
to choose a bounded linear operator $L_n$ with finite-dimensional image such that
$\|L - L_n\| < \epsilon$. Since a bounded linear operator and its adjoint have the same
norm, $\|L^* - L_n^*\| < \epsilon$. Since $L_n^*$ has finite-dimensional image, according to what
we have just seen, and since we can obtain such an approximation for any $\epsilon > 0$,
(b) shows that $L^*$ is compact.
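A concrete instance of conclusion (c) may help fix ideas. The sketch below is an illustration under stated assumptions rather than part of the proof: it models a diagonal operator with entries $1, 1/2, 1/3, \dots$ on a truncated copy of $\ell^2$ (the truncation dimension and all function names are hypothetical), takes $E_n$ to be the projection on the first $n$ coordinates, and uses the fact that the operator norm of a diagonal matrix is its largest entry in absolute value, so that $\|(1 - E_n)L\| = 1/(n+1)$.

```python
def diagonal_weights(dim):
    """Diagonal entries 1, 1/2, ..., 1/dim of a model operator L on a truncated l^2."""
    return [1.0 / k for k in range(1, dim + 1)]

def apply_diagonal(weights, x):
    """Apply the diagonal operator to a coordinate vector x."""
    return [w * xi for w, xi in zip(weights, x)]

def truncation_error(weights, n):
    """Operator norm of (1 - E_n)L for diagonal L: the largest |entry| beyond index n."""
    tail = [abs(w) for w in weights[n:]]
    return max(tail) if tail else 0.0

def rank_needed(weights, eps):
    """Smallest n for which the finite-rank operator E_n L is within eps of L."""
    for n in range(len(weights) + 1):
        if truncation_error(weights, n) < eps:
            return n
    return len(weights)

if __name__ == "__main__":
    w = diagonal_weights(1000)
    for n in (1, 9, 99):
        print(n, truncation_error(w, n))
    print(rank_needed(w, 0.015))
```

The printed errors decrease to 0, which is the operator-norm convergence $E_nL \to L$ used in the proof of (c).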


2. Spectral Theorem for Compact Self-Adjoint Operators 


Let $L : H \to H$ be a bounded linear operator on the real or complex Hilbert
space $H$. One says that a nonzero vector $v$ is an eigenvector of $L$ if $L(v) = cv$ for
some constant $c$; the constant $c$ is called the corresponding eigenvalue. The set
of all $u$ for which $L(u) = cu$ is a closed vector subspace; under the assumption
that this subspace is not 0, it is called the eigenspace for the eigenvalue $c$.

In the finite-dimensional case, the self-adjointness condition $L^* = L$ means
that $L$ corresponds to a Hermitian matrix $A$, i.e., a matrix equal to its conjugate
transpose, once one fixes an ordered orthonormal basis. In this case it is shown
in linear algebra that the members of an orthonormal basis can be chosen to
be eigenvectors of $L$, the eigenvalues all being real. In terms of matrices, the
corresponding matrix $A$ is conjugate via a unitary matrix, i.e., a matrix whose
conjugate transpose is its inverse, to a diagonal matrix with real entries. This result
is called the Spectral Theorem for such linear operators or matrices. A quick proof
goes as follows: An eigenvector $v$ of $L$ with eigenvalue $c$ has $(L - cI)(v) = 0$, and
this implies that the matrix $A$ of $L$ has the property that $A - cI$ has a nonzero null
space. Hence $\det(A - cI) = 0$ if and only if $c$ is an eigenvalue of $L$. One readily
sees from the self-adjointness of $L$ that all complex roots of $\det(A - cI)$ have to
be real. Moreover, if $L$ carries a vector subspace $M$ into itself, then $L$ carries $M^\perp$
into itself as well. Finite-dimensionality forces $A$ to have a complex eigenvalue,
and this must be real. Hence there is a nonzero vector $u$ with $L(u) = cu$ for
some real $c$. Normalizing, we may assume that $u$ has norm 1. If $M$ consists of
the scalar multiples of $u$, then $L$ carries $M^\perp$ to itself, and the restriction of $L$
to $M^\perp$ is self adjoint. Proceeding inductively, we obtain a system of orthogonal
eigenvectors for $L$, each of norm 1.

A certain amount of this argument works in the infinite-dimensional case. In
fact, suppose that $L$ is self adjoint. Then any $u$ in $H$ has

$$(L(u), u) = (u, L^*(u)) = (u, L(u)) = \overline{(L(u), u)},$$

and hence the function $u \mapsto (L(u), u)$ is real-valued. If $u$ is an eigenvector in
$H$ with eigenvalue $c$, i.e., if $L(u) = cu$, then $c(u, u) = (L(u), u)$ is real; since
$(u, u)$ is real and nonzero, $c$ is real. If $u_1$ and $u_2$ are eigenvectors for distinct
eigenvalues $c_1$ and $c_2$, then $u_1$ and $u_2$ are orthogonal because

$$(c_1 - c_2)(u_1, u_2) = (c_1u_1, u_2) - (u_1, c_2u_2) = (L(u_1), u_2) - (u_1, L(u_2)) = 0.$$

If $M$ is a vector subspace of $H$ with $L(M) \subseteq M$, then also $L(M^\perp) \subseteq M^\perp$
because $m \in M$ and $m^\perp \in M^\perp$ together imply

$$0 = (L(m), m^\perp) = (m, L(m^\perp)).$$


These observations prove everything in the following proposition except the last 
statement. 


Proposition 2.2. If $L : H \to H$ is a bounded self-adjoint linear operator
on a Hilbert space $H$, then $u \mapsto (L(u), u)$ is real-valued, every eigenvalue of $L$
is real, eigenvectors under $L$ for distinct eigenvalues are orthogonal, and every
vector subspace $M$ with $L(M) \subseteq M$ has $L(M^\perp) \subseteq M^\perp$. In addition,

$$\|L\| = \sup_{\|u\| \le 1} |(L(u), u)|.$$

PROOF. We are left with proving the displayed formula. Inequality in one
direction is easy: we have

$$\sup_{\|u\| \le 1} |(L(u), u)| \le \sup_{\|u\| \le 1,\ \|v\| \le 1} |(L(u), v)| = \|L\|.$$

With $C = \sup_{\|u\| \le 1} |(L(u), u)|$, we are therefore to prove that $\|L\| \le C$, hence
that $\|L(u)\| \le C\|u\|$ for all $u$. In doing so, we may assume that $u \ne 0$ and
$L(u) \ne 0$. Let $t$ be a positive real number. Since $(L^2(u), u) = (L(u), L(u))$, we
have

$$\|L(u)\|^2 = \tfrac{1}{4}\Big[\big(L(tu + t^{-1}L(u)),\, tu + t^{-1}L(u)\big) - \big(L(tu - t^{-1}L(u)),\, tu - t^{-1}L(u)\big)\Big]$$

$$\le \tfrac{1}{4}\Big[C\|tu + t^{-1}L(u)\|^2 + C\|tu - t^{-1}L(u)\|^2\Big]$$

$$= \tfrac{1}{2}C\Big[\|tu\|^2 + \|t^{-1}L(u)\|^2\Big],$$

the last step following from the parallelogram law. By differential calculus
the minimum of an expression $a^2t^2 + b^2t^{-2}$, in which $a$ and $b$ are positive
constants, is attained when $t^2 = b/a$. Here $a = \|u\|$ and $b = \|L(u)\|$, and
thus $\|L(u)\|^2 \le \tfrac{1}{2}C\big[\|L(u)\|\,\|u\| + \|L(u)\|\,\|u\|\big] = C\|L(u)\|\,\|u\|$. Dividing by
$\|L(u)\|$ gives $\|L(u)\| \le C\|u\|$ and completes the proof.
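The displayed formula of Proposition 2.2 can be tested on $2 \times 2$ real matrices, where unit vectors are parametrized by an angle. The sketch below is a numerical illustration only, with hypothetical helper names; it samples $|(Au, u)|$ over the unit circle and compares the result with the operator norm, computed in closed form as the square root of the largest eigenvalue of $A^{T}A$. The second matrix is not symmetric and shows that self-adjointness cannot simply be dropped when the scalars are real.

```python
import math

def quad_form_sup(A, samples=20000):
    """Approximate sup |(Au, u)| over real unit vectors u = (cos t, sin t)."""
    best = 0.0
    for k in range(samples):
        t = math.pi * k / samples      # u and -u give the same value
        u = (math.cos(t), math.sin(t))
        Au = (A[0][0] * u[0] + A[0][1] * u[1],
              A[1][0] * u[0] + A[1][1] * u[1])
        best = max(best, abs(Au[0] * u[0] + Au[1] * u[1]))
    return best

def op_norm_2x2(A):
    """Operator norm of a real 2x2 matrix: sqrt of the largest eigenvalue of A^T A."""
    t = sum(A[i][j] ** 2 for i in range(2) for j in range(2))   # trace of A^T A
    d = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) ** 2            # det of A^T A
    return math.sqrt((t + math.sqrt(max(t * t - 4.0 * d, 0.0))) / 2.0)

SYMMETRIC = [[2.0, 1.0], [1.0, 2.0]]
NILPOTENT = [[0.0, 1.0], [0.0, 0.0]]   # not self adjoint

if __name__ == "__main__":
    print(quad_form_sup(SYMMETRIC), op_norm_2x2(SYMMETRIC))
    print(quad_form_sup(NILPOTENT), op_norm_2x2(NILPOTENT))
```

For the symmetric matrix the two numbers agree, as the proposition predicts. For the nilpotent matrix the supremum of $|(Au, u)|$ is $1/2$ while the norm is 1.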


In the infinite-dimensional case, in which we work with the operator $L$ but
no matrix, consider what is needed to imitate the proof of the finite-dimensional
Spectral Theorem and thereby find an orthonormal basis of vectors carried by $L$
to multiples of themselves. In the formula of Proposition 2.2, if we can find some
$u$ with $\|u\| = 1$ such that $\|L\| = |(L(u), u)|$, then this $u$ satisfies $\|L\|\,\|u\|^2 =
|(L(u), u)| \le \|L(u)\|\,\|u\| \le \|L\|\,\|u\|^2$, and we conclude that $|(L(u), u)| =
\|L(u)\|\,\|u\|$, i.e., that equality holds in the Schwarz inequality. Reviewing the
proof of the Schwarz inequality, we see that $L(u)$ and $u$ are proportional. Thus $u$
is an eigenvector of $L$, and we can at least get started with the proof.

Unfortunately an orthonormal basis of eigenvectors need not exist for a self-adjoint
$L$ without an extra hypothesis. In fact, take $H = L^2([0, 1])$ with $(f, g) =
\int_0^1 f\bar{g}\,dx$, and define $L(f)(x) = xf(x)$. This linear operator $L$ has norm 1,
and the equality $(f, L(g)) = \int_0^1 xf(x)\overline{g(x)}\,dx = (L(f), g)$ shows that $L$ is
self adjoint. On the other hand, the only function $f$ with $xf = cf$ a.e. for some
constant $c$ is the 0 function. Thus we get no eigenvectors at all, and the supremum
in the formula of Proposition 2.2 need not be attained.
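To see the supremum approached but not attained in this example, one can evaluate $(L(f), f)$ on a specific family of unit vectors. The sketch below assumes the family $f_n(x) = \sqrt{2n+1}\,x^n$, which is our choice for illustration and does not appear in the text. One checks directly that $\|f_n\| = 1$ and $(L(f_n), f_n) = \int_0^1 (2n+1)x^{2n+1}\,dx = (2n+1)/(2n+2)$, and the code confirms this with a midpoint rule.

```python
def rayleigh_exact(n):
    """Exact value of (L f_n, f_n) for the assumed family f_n(x) = sqrt(2n+1) x^n."""
    return (2.0 * n + 1.0) / (2.0 * n + 2.0)

def rayleigh_midpoint(n, steps=20000):
    """Midpoint-rule value of the integral of x |f_n(x)|^2 over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x * (2 * n + 1) * x ** (2 * n)
    return total * h

if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, rayleigh_exact(n), rayleigh_midpoint(n))
```

The values increase toward $\|L\| = 1$ but stay strictly below it, consistent with the absence of eigenvectors.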
The hypothesis that we shall add to obtain an orthonormal basis of eigenvectors
is that $L$ is compact in the sense of the previous section. Each compact self-adjoint
operator has an orthonormal basis of eigenvectors, according to the following
theorem.


Theorem 2.3 (Spectral Theorem for compact self-adjoint operators). Let
$L : H \to H$ be a compact self-adjoint linear operator on a real or complex
Hilbert space $H$. Then $H$ has an orthonormal basis of eigenvectors of $L$. In
addition, for each scalar $\lambda$, let

$$H_\lambda = \{u \in H \mid L(u) = \lambda u\},$$

so that $H_\lambda - \{0\}$ consists exactly of the eigenvectors of $L$ with eigenvalue $\lambda$.
Then the number of eigenvalues of $L$ is countable, the eigenvalues are all real,
the spaces $H_\lambda$ are mutually orthogonal, each $H_\lambda$ for $\lambda \ne 0$ is finite dimensional,
any orthonormal basis of $H$ of eigenvectors under $L$ is the union of orthonormal
bases of the $H_\lambda$'s, and for any $\epsilon > 0$, there are only finitely many $\lambda$ with $H_\lambda \ne 0$
and $|\lambda| \ge \epsilon$. Moreover, either or both of $\|L\|$ and $-\|L\|$ are eigenvalues, and
these are the eigenvalues with the largest absolute value.


PROOF. We know from Proposition 2.2 that the eigenvalues of $L$ are all real
and that the spaces $H_\lambda$ are mutually orthogonal. In addition, the formula $\|L\| =
\sup_{\|u\| \le 1} \|L(u)\|$ shows that no eigenvalue can be greater than $\|L\|$ in absolute
value.

The theorem certainly holds if $L = 0$ since every nonzero vector is an eigenvector.
Thus we may assume that $\|L\| > 0$.

The main step is to produce an eigenvector with one of $\|L\|$ and $-\|L\|$ as eigenvalue.
Taking the equality $\|L\| = \sup_{\|u\| \le 1} |(L(u), u)|$ of Proposition 2.2 into account,
choose a sequence $\{u_n\}$ with $\|u_n\| = 1$ such that $\lim_n |(L(u_n), u_n)| = \|L\|$.
Since the proposition shows that $(L(u_n), u_n)$ has to be real, we may assume that
this sequence is chosen so that $\lambda = \lim_n (L(u_n), u_n)$ exists. Then $\lambda = \pm\|L\|$.
Using the compactness of $L$ and passing to a subsequence if necessary, we may
assume that $L(u_n)$ converges to some limit $v_0$. Meanwhile,

$$0 \le \|L(u_n) - \lambda u_n\|^2 = \|L(u_n)\|^2 - 2\lambda\,\mathrm{Re}(L(u_n), u_n) + \lambda^2\|u_n\|^2 \le \|L\|^2 - 2\lambda\,\mathrm{Re}(L(u_n), u_n) + \lambda^2.$$

The equalities $\lambda^2 = \|L\|^2$ and $\lim_n (L(u_n), u_n) = \lambda$ show that the right side tends
to 0, and thus $\lim_n \|L(u_n) - \lambda u_n\| = 0$. Since $\lim_n \|L(u_n) - v_0\| = 0$ also, the
triangle inequality shows that $\lim \lambda u_n$ exists and equals $v_0$. Since $\lambda \ne 0$, $\lim u_n$
exists and $v_0 = \lambda \lim u_n$. Consequently $\|v_0\| = |\lambda| \lim \|u_n\| = |\lambda| = \|L\| \ne 0$.
Applying $L$ to the equation $v_0 = \lambda \lim u_n$ and taking into account that $L$ is
continuous and that $\lim L(u_n) = v_0$, we see that $L(v_0) = \lambda v_0$. Thus $v_0$ is an
eigenvector with eigenvalue $\lambda$, and the main step is complete.
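In finite dimensions the conclusion of the main step can be observed numerically. The sketch below uses power iteration, which is a swapped-in numerical technique and not the compactness argument of the proof; the function names are hypothetical. It assumes a real symmetric matrix whose eigenvalue of largest absolute value is simple and whose starting vector is not orthogonal to the corresponding eigenvector. If both $\|L\|$ and $-\|L\|$ were eigenvalues, this iteration would not settle down.

```python
import math

def mat_vec(A, u):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def norm(u):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in u))

def dominant_eigenpair(A, iterations=200):
    """Power iteration; returns (Rayleigh quotient, unit vector, residual norm)."""
    n = len(A)
    u = [1.0 / math.sqrt(n)] * n          # assumed not orthogonal to the dominant eigenvector
    for _ in range(iterations):
        w = mat_vec(A, u)
        size = norm(w)
        u = [x / size for x in w]
    w = mat_vec(A, u)
    lam = sum(a * b for a, b in zip(w, u))                 # (L(u), u) with ||u|| = 1
    residual = norm([wi - lam * ui for wi, ui in zip(w, u)])
    return lam, u, residual

if __name__ == "__main__":
    print(dominant_eigenpair([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]))
    print(dominant_eigenpair([[0.0, 2.0], [2.0, -3.0]]))
```

The first matrix has $\|L\| = 2 + \sqrt{2}$ as an eigenvalue; the second has eigenvalues $1$ and $-4$, so that it is $-\|L\|$ that occurs.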
Now consider the collection of all orthonormal systems of eigenvectors for
$L$, and order it by inclusion upward. A chain consists of nested such systems,
and the union of the members of a chain is again such an orthonormal system.
By Zorn's Lemma the collection contains a maximal element $S$. Let $M$ be the
smallest closed vector subspace containing this maximal orthonormal system $S$ of
eigenvectors. Since the collection of all finite linear combinations of members of
$S$ is dense in $M$, the continuity of $L$ shows that $L(M) \subseteq M$. By Proposition 2.2,
$L(M^\perp) \subseteq M^\perp$. The equality $(L(u), v) = (u, L(v))$ for any two members $u$ and
$v$ of $M^\perp$ shows that the restriction of $L$ to $M^\perp$ is self adjoint, and this restriction
is certainly bounded and compact. Arguing by contradiction, suppose $M^\perp \ne 0$.
Then either the restriction of $L$ to $M^\perp$ is 0, or else it is nonzero and the main step
above, applied to this restriction, shows that $L$ has an
eigenvector in $M^\perp$. Thus $L$ has an eigenvector $v_0$ of norm 1 in $M^\perp$ in either
case. But then $S \cup \{v_0\}$ would be an orthonormal system of eigenvectors properly
containing $S$, in contradiction to the maximality. We conclude that $M^\perp = 0$.
Since $M$ is a closed vector subspace of $H$, it satisfies $M^{\perp\perp} = M$. Therefore
$M = (M^\perp)^\perp = 0^\perp = H$, and $H$ has an orthonormal basis of eigenvectors.

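With the orthonormal basis $S = \{v_\alpha\}$ of eigenvectors fixed, consider all $v_\alpha$'s
for which the corresponding eigenvalue $\lambda_\alpha$ has $|\lambda_\alpha| \ge \epsilon$. If $\alpha_1$ and $\alpha_2$ are two
distinct such indices, we have

$$\|L(v_{\alpha_1}) - L(v_{\alpha_2})\|^2 = \|\lambda_{\alpha_1}v_{\alpha_1} - \lambda_{\alpha_2}v_{\alpha_2}\|^2$$

$$= \|\lambda_{\alpha_1}v_{\alpha_1}\|^2 + \|\lambda_{\alpha_2}v_{\alpha_2}\|^2 \quad \text{by the Pythagorean theorem}$$

$$= |\lambda_{\alpha_1}|^2 + |\lambda_{\alpha_2}|^2 \ge 2\epsilon^2.$$

If there were infinitely many such eigenvectors $v_{\alpha_n}$, the bounded sequence
$\{L(v_{\alpha_n})\}$ could not have a convergent subsequence, in contradiction to compactness.
Thus only finitely many members of $S$ have eigenvalue with absolute value
$\ge \epsilon$.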

Fix $\lambda \ne 0$, let $S_\lambda$ be the finite set of members of $S$ with eigenvalue $\lambda$, and
let $H_\lambda$ be the linear span of $S_\lambda$. If $v$ is an eigenvector of $L$ for the eigenvalue $\lambda$
beyond the vectors in $H_\lambda$, then the expansion

$$v = \sum_{v_\alpha \in S_\lambda} (v, v_\alpha)v_\alpha + \sum_{v_\alpha \in S - S_\lambda} (v, v_\alpha)v_\alpha$$

shows that $(v, v_\beta) \ne 0$ for some $v_\beta$ in $S - S_\lambda$. This $v_\beta$ must have eigenvalue $\lambda'$
different from $\lambda$, and then Proposition 2.2 gives the contradiction $(v, v_\beta) = 0$. We
conclude that $H_\lambda$ is the entire eigenspace for eigenvalue $\lambda$ and that the eigenvalues
of the members of $S$ are the only eigenvalues of $L$.

For each positive integer $n$, we know that only finitely many eigenvalues $\lambda$
corresponding to members of $S$ have $|\lambda| \ge 1/n$. Since every eigenvalue of $L$
is the eigenvalue for some member of $S$, the number of eigenvalues $\lambda$ of $L$ with
$|\lambda| \ge 1/n$ is finite. Taking the union of these sets as $n$ varies, we see that the
number of eigenvalues of $L$ is countable. This completes the proof.


3. Hilbert-Schmidt Theorem


The Hilbert-Schmidt Theorem was postponed from Section I.3, where it was used
in connection with Sturm—Liouville theory. The nub of the matter is the Spectral 
Theorem for compact self-adjoint operators on a Hilbert space, Theorem 2.3. 
But the actual result quoted in Section I.3 contains an overlay of measure theory 
and continuity. Correspondingly there is an abstract Hilbert-Schmidt Theorem, 
which combines the Spectral Theorem with the measure theory, and then there 
is a concrete form, which adds the hypothesis of continuity and obtains extra 
conclusions from it. 

The abstract theorem works with an integral operator on $L^2$ of a $\sigma$-finite
measure space $(X, \mu)$, the operator being of the form

$$Tf(x) = \int_X K(x, y) f(y)\,d\mu(y),$$

where $K(x, y)$ is measurable on $X \times X$. The function $K$ is called the kernel³ of
the operator. If $f$ is in $L^2(X, \mu)$, then the Schwarz inequality gives $|Tf(x)| \le
\|K(x, \cdot\,)\|_2\,\|f\|_2$ for each $x$ in $X$. Squaring both sides, integrating, and taking the
square root yields $\|Tf\|_2 \le \big(\int_{X \times X} |K|^2\,d(\mu \times \mu)\big)^{1/2}\,\|f\|_2$. As a linear operator
on $L^2(X, \mu)$, $T$ therefore has operator norm satisfying

$$\|T\| \le \Big(\int_X\!\int_X |K(x, y)|^2\,d\mu(x)\,d\mu(y)\Big)^{1/2} = \|K\|_2.$$

In particular, $T$ is bounded if $K$ is square-integrable on $X \times X$. In this case the
adjoint of $T$ is given by

$$T^*g(x) = \int_X \overline{K(y, x)}\,g(y)\,d\mu(y)$$

because $(Tf, g) = \int_X\!\int_X K(x, y) f(y)\overline{g(x)}\,d\mu(y)\,d\mu(x)$ and because the asserted
form of $T^*$ has

$$(f, T^*g) = \int_X f(x)\,\overline{\Big(\int_X \overline{K(y, x)}\,g(y)\,d\mu(y)\Big)}\,d\mu(x) = \int_X\!\int_X f(x) K(y, x)\overline{g(y)}\,d\mu(y)\,d\mu(x).$$


³Not to be confused with the abstract-algebra notion of “kernel” as the set mapped to 0.
Theorem 2.4 (Hilbert-Schmidt Theorem, abstract form). Let $(X, \mu)$ be a
$\sigma$-finite measure space, and let $K(\,\cdot\,,\,\cdot\,)$ be a complex-valued $L^2$ function on
$X \times X$ such that $K(x, y) = \overline{K(y, x)}$ for all $x$ and $y$ in $X$. Then the linear
operator $T$ defined by

$$(Tf)(x) = \int_X K(x, y) f(y)\,d\mu(y)$$

is a self-adjoint compact operator on the Hilbert space $L^2(X, \mu)$ with $\|T\| \le
\|K\|_2$. Consequently if for each complex $\lambda \ne 0$, a vector subspace $V_\lambda$ of $L^2(X, \mu)$
is defined by

$$V_\lambda = \{f \in L^2(X, \mu) \mid Tf = \lambda f\},$$

then each $V_\lambda$ is finite dimensional, the space $V_\lambda$ is nonzero for only countably
many $\lambda$, the spaces $V_\lambda$ are mutually orthogonal with respect to the inner product
on $L^2(X, \mu)$, the $\lambda$'s with $V_\lambda \ne 0$ are all real, and for any $\epsilon > 0$, there are only
finitely many $\lambda$ with $V_\lambda \ne 0$ and $|\lambda| \ge \epsilon$. The largest value of $|\lambda|$ for which
$V_\lambda \ne 0$ is $\|T\|$. Moreover, the vector subspace of $L^2$ orthogonal to all $V_\lambda$ is the
kernel of $T$, so that if $v_1, v_2, \dots$ is an enumeration of the union of orthonormal
bases of the spaces $V_\lambda$ with $\lambda \ne 0$, then for any $f$ in $L^2(X, \mu)$,

$$Tf = \sum_{n=1}^{\infty} (Tf, v_n)v_n,$$

the series on the right side being convergent in $L^2(X, \mu)$.


PROOF. Theorem 2.3 shows that it is enough to prove that the self-adjoint
bounded linear operator $T$ is compact. Choose a sequence of simple functions
$K_n$ square integrable on $X \times X$ such that $\lim_n \|K - K_n\|_2 = 0$, and define
$T_n f(x) = \int_X K_n(x, y) f(y)\,d\mu(y)$. The linear operator $T_n$ is bounded with
$\|T_n\| \le \|K_n\|_2$, and it has finite-dimensional image since $K_n$ is simple. By
Proposition 2.1a, $T_n$ is compact. Since $\|T - T_n\| \le \|K - K_n\|_2$ and since the
right side tends to 0, $T$ is exhibited as the limit of $T_n$ in the operator norm and is
compact by Proposition 2.1b.
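The bound $\|T\| \le \|K\|_2$ and the statement that $\|T\|$ is the largest $|\lambda|$ can be illustrated on a discretized kernel. The sketch below is an illustration under assumptions of our own: it takes $X = [0, 1]$ with Lebesgue measure and the kernel $K(x, y) = \min(x, y)$, replaces the integral by a midpoint rule on 50 nodes, and estimates the norm by power iteration (a numerical device, not the method of the proof). For this kernel the largest eigenvalue is $4/\pi^2$ and $\|K\|_2 = 1/\sqrt{6}$, and the function names are hypothetical.

```python
import math

def discretized_kernel(kernel, n):
    """Midpoint-rule matrix A[i][j] = K(x_i, x_j) * h for X = [0, 1] with Lebesgue measure."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    return [[kernel(x, y) * h for y in xs] for x in xs]

def frobenius_norm(A):
    """Square root of the sum of squared entries; approximates ||K||_2."""
    return math.sqrt(sum(a * a for row in A for a in row))

def spectral_norm_symmetric(A, iterations=100):
    """Power-iteration estimate of the largest |eigenvalue| of a symmetric matrix."""
    n = len(A)
    u = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, u)) for row in A]
        size = math.sqrt(sum(x * x for x in w))
        u = [x / size for x in w]
    w = [sum(a * x for a, x in zip(row, u)) for row in A]
    return abs(sum(a * b for a, b in zip(w, u)))

A = discretized_kernel(min, 50)        # K(x, y) = min(x, y), real and symmetric
hs_bound = frobenius_norm(A)           # approximates ||K||_2 = 1/sqrt(6)
op_norm = spectral_norm_symmetric(A)   # approximates ||T|| = 4/pi^2

if __name__ == "__main__":
    print(op_norm, hs_bound)
```

The two printed numbers are close but distinct, reflecting the fact that $\|K\|_2^2$ is the sum of the squares of all the eigenvalues while $\|T\|$ sees only the largest one.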


Now we include the overlay of continuity. The additional assumptions are that
$X$ is a compact metric space, $\mu$ is a Borel measure on $X$ that assigns positive
measure to every nonempty open set, and $K$ is continuous on $X \times X$. The
additional conclusions are that the eigenfunctions for the nonzero eigenvalues are
continuous and that the series expansion actually converges absolutely uniformly
as well as in $L^2$. The result used in Section I.3 was the special case of this result
with $X = [a, b]$ and $\mu$ equal to Lebesgue measure.
Theorem 2.5 (Hilbert-Schmidt Theorem, concrete form). Let $X$ be a compact
metric space, let $\mu$ be a Borel measure on $X$ that assigns positive measure to every
nonempty open set, and let $K(\,\cdot\,,\,\cdot\,)$ be a complex-valued continuous function on
$X \times X$ such that $K(x, y) = \overline{K(y, x)}$ for all $x$ and $y$ in $X$. Then the linear operator
$T$ defined by

$$(Tf)(x) = \int_X K(x, y) f(y)\,d\mu(y)$$

is a self-adjoint compact operator on the Hilbert space $L^2(X, \mu)$ with $\|T\| \le
\|K\|_2$, and its image lies in $C(X)$. Consequently the vector subspace $V_\lambda$ of
$L^2(X, \mu)$ defined for any complex $\lambda \ne 0$ by

$$V_\lambda = \{f \in L^2(X, \mu) \mid Tf = \lambda f\}$$

consists of continuous functions, each $V_\lambda$ is finite dimensional, the space $V_\lambda$ is
nonzero for only countably many $\lambda$, the spaces $V_\lambda$ are mutually orthogonal with
respect to the inner product on $L^2(X, \mu)$, the $\lambda$'s with $V_\lambda \ne 0$ are all real, and for
any $\epsilon > 0$, there are only finitely many $\lambda$ with $V_\lambda \ne 0$ and $|\lambda| \ge \epsilon$. The largest
value of $|\lambda|$ for which $V_\lambda \ne 0$ is $\|T\|$. If $v_1, v_2, \dots$ is an enumeration of the union
of orthonormal bases of the spaces $V_\lambda$ with $\lambda \ne 0$, then for any $f$ in $L^2(X, \mu)$,

$$Tf(x) = \sum_{n=1}^{\infty} (Tf, v_n)v_n(x),$$

the series on the right side being absolutely uniformly convergent for $x$ in $X$.


REMARK. The hypothesis that $\mu$ assigns positive measure to every nonempty
open set is used only to identify $\sum_{n=1}^{\infty} (Tf, v_n)v_n(x)$ with $Tf(x)$ at every point.
Without this particular hypothesis on $\mu$, the series is still absolutely uniformly
convergent, but its sum is shown to equal $Tf(x)$ only almost everywhere with
respect to $\mu$.


PROOF. Given $\epsilon > 0$, choose $\delta > 0$ by uniform continuity of $K$ such that
$|K(x, y) - K(x_0, y_0)| < \epsilon$ whenever $(x, y)$ and $(x_0, y_0)$ are at distance $< \delta$. If $f$
is in $L^2(X, \mu)$ and the points $x$ and $x_0$ are at distance $< \delta$, then $(x, y)$ and $(x_0, y)$
are at distance $< \delta$ and hence

$$|Tf(x) - Tf(x_0)| \le \int_X |K(x, y) - K(x_0, y)|\,|f(y)|\,d\mu(y) \le \epsilon \int_X |f(y)|\,d\mu(y) \le \epsilon\,\|f\|_2\,(\mu(X))^{1/2},$$

the last step following from the Schwarz inequality. This proves that $Tf$ is
continuous for each $f$ in $L^2(X, \mu)$. In particular, if $Tf = \lambda f$ with $\lambda \ne 0$,
then $f = T(\lambda^{-1} f)$ exhibits $f$ as in the image of $T$ and therefore as continuous.
Everything in the theorem now follows from Theorem 2.4 except for the absolute
uniform convergence to $Tf(x)$ in the last sentence of the theorem.

For the absolute uniform convergence, let $(\,\cdot\,,\,\cdot\,)$ denote the inner product in
$L^2(X, \mu)$. We begin by considering the function $K(x, \cdot\,)$ for fixed $x$. It satisfies

$$(K(x, \cdot\,), \overline{v_n}) = \int_X K(x, y)v_n(y)\,d\mu(y) = (Tv_n)(x) = \lambda_n v_n(x)$$

if $v_n$ is in $V_{\lambda_n}$, and Bessel's inequality gives

$$\sum_{n=1}^{N} |\lambda_n v_n(x)|^2 \le \int_X |K(x, y)|^2\,d\mu(y) \le \|K\|_{\sup}^2\,\mu(X) \qquad (*)$$

for all $N$ and $x$. Since the $v_n$ form an orthonormal basis of the orthogonal complement of the kernel of $T$,

$$\lim_{N\to\infty} \Big\|Tg - \sum_{n=1}^{N} (Tg, v_n)v_n\Big\|_2 = 0 \qquad (**)$$

for all $g$ in $L^2(X, \mu)$. Meanwhile, we have

$$(Tg, v_n)v_n(x) = (g, Tv_n)v_n(x) = \lambda_n (g, v_n)v_n(x).$$

Application of the Schwarz inequality and $(*)$ gives

$$\sum_{n=M}^{N} |(Tg, v_n)v_n(x)| = \sum_{n=M}^{N} |\lambda_n (g, v_n)v_n(x)|$$

$$\le \Big(\sum_{n=M}^{N} |\lambda_n v_n(x)|^2\Big)^{1/2} \Big(\sum_{n=M}^{N} |(g, v_n)|^2\Big)^{1/2}$$

$$\le \|K\|_{\sup}\,\mu(X)^{1/2} \Big(\sum_{n=M}^{N} |(g, v_n)|^2\Big)^{1/2}.$$

Bessel's inequality shows that the series $\sum_{n=1}^{\infty} |(g, v_n)|^2$ converges and has sum
$\le \|g\|_2^2$. Therefore $\sum_{n=M}^{N} |(g, v_n)|^2$ tends to 0 as $M$ and $N$ tend to infinity, and
the rate is independent of $x$. Consequently the series $\sum_{n=1}^{\infty} |(Tg, v_n)v_n(x)|$ is
uniformly Cauchy, and it follows that the series $\sum_{n=1}^{\infty} (Tg, v_n)v_n(x)$ is absolutely
uniformly convergent for $x$ in $X$. Since the uniform limit of continuous
functions is continuous, the sum has to be a continuous function. Since $(**)$
shows that $\sum_{n=1}^{\infty} (Tg, v_n)v_n$ converges in $L^2(X, \mu)$ to $Tg$, a subsequence of
the partial sums $\sum_{n=1}^{N} (Tg, v_n)v_n(x)$ converges almost everywhere to $Tg(x)$. Since $Tg$ is
continuous, the set where $\sum_{n=1}^{\infty} (Tg, v_n)v_n(x) \ne Tg(x)$ is an open set. The fact
that this set has measure 0 implies, in view of the hypothesis on $\mu$, that this set is
empty. Thus $\sum_{n=1}^{\infty} (Tg, v_n)v_n(x)$ converges absolutely uniformly to $Tg(x)$.
4. Unitary Operators


In $\mathbb{C}^N$, a unitary matrix corresponds in the standard basis to a unitary linear
transformation $U$, i.e., one with $U^* = U^{-1}$. Such a transformation preserves
inner products and therefore carries any orthonormal basis to another orthonormal
basis. Conversely any linear transformation from $\mathbb{C}^N$ to itself that carries
some orthonormal basis to another orthonormal basis is unitary. For the
infinite-dimensional case we define a linear operator to be unitary if it satisfies the
equivalent conditions in the following proposition.⁴


Proposition 2.6. If $V$ is a real or complex Hilbert space, then the following
conditions on a linear operator $U : V \to V$ are equivalent:
(a) $UU^* = U^*U = 1$,
(b) $U$ is onto $V$, and $(Uv, Uv') = (v, v')$ for all $v$ and $v'$ in $V$,
(c) $U$ is onto $V$, and $\|Uv\| = \|v\|$ for all $v$ in $V$.


A unitary operator carries any orthonormal basis to an orthonormal basis. Conversely
if $\{u_i\}$ and $\{v_i\}$ are orthonormal bases, then there exists a unique bounded
linear operator $U$ such that $Uu_i = v_i$ for all $i$, and $U$ is unitary.


REMARKS. In the finite-dimensional case the condition “$UU^* = 1$” in (a) and
the condition “$U$ is onto $V$” in (b) and (c) follow from the rest, but that implication
fails in the infinite-dimensional case. Any two orthonormal bases have the same
cardinality, by Proposition 12.11 of Basic, and hence the index sets for $\{u_i\}$ and
$\{v_i\}$ in the statement of the proposition may be taken to be the same.
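The failure mentioned in the remarks is visible for the unilateral shift on $\ell^2$, a standard example that is not taken from the text. The sketch below models real, finitely supported sequences as Python lists (a simplifying assumption, with hypothetical function names) and checks that the shift $S$ preserves inner products and satisfies $S^*S = 1$, while $SS^* \ne 1$ because $S$ is not onto.

```python
def shift(x):
    """Unilateral shift S(x_1, x_2, ...) = (0, x_1, x_2, ...) on finitely supported sequences."""
    return [0.0] + list(x)

def shift_adjoint(x):
    """Adjoint S*(x_1, x_2, x_3, ...) = (x_2, x_3, ...)."""
    return list(x[1:])

def inner(x, y):
    """Real inner product; zip stops at the shorter list, which amounts to padding with zeros."""
    return sum(a * b for a, b in zip(x, y))

if __name__ == "__main__":
    x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]
    print(inner(shift(x), shift(y)), inner(x, y))      # S preserves inner products
    print(inner(shift(x), y), inner(x, shift_adjoint(y)))  # adjoint relation
    print(shift_adjoint(shift(x)))                     # S*S x = x
    print(shift(shift_adjoint(x)))                     # SS* x loses the first coordinate
```

Thus $S$ satisfies the norm-preserving half of conditions (b) and (c) without being unitary, which is why surjectivity has to be assumed separately in infinite dimensions.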


PROOF. If (a) holds, then $UU^* = 1$ proves that $U$ is onto, and $U^*U = 1$
proves that $(Uv, Uv') = (U^*Uv, v') = (v, v')$. Thus (b) holds. In the reverse
direction, suppose that (b) holds. From $(U^*Uv, v') = (Uv, Uv') = (v, v')$ for
all $v$ and $v'$, we see that $U^*U = 1$. Thus $U$ is one-one. Since $U$ is assumed onto,
it has a two-sided inverse, which must then equal $U^*$ since any left inverse equals
any right inverse. Thus (a) holds, and (a) and (b) are equivalent. Conditions (b)
and (c) are equivalent by polarization.

If $\{u_i\}$ is an orthonormal basis and $U$ is unitary, then $(Uu_i, Uu_j) = (u_i, u_j) =
\delta_{ij}$ by (b), and hence $\{Uu_i\}$ is an orthonormal set. If $(v, Uu_i) = 0$ for all $i$, then
$(U^*v, u_i) = 0$ for all $i$, $U^*v = 0$, and $v = U(U^*v) = U0 = 0$. So $\{Uu_i\}$ is an
orthonormal basis.

If $\{u_i\}$ and $\{v_i\}$ are orthonormal bases, define $U$ on finite linear combinations
of the $u_i$ by $U\big(\sum_i c_iu_i\big) = \sum_i c_iv_i$. Then $\big\|U\big(\sum_i c_iu_i\big)\big\|^2 = \big\|\sum_i c_iv_i\big\|^2 =
\sum_i |c_i|^2 = \big\|\sum_i c_iu_i\big\|^2$. Hence $U$ extends to a bounded linear operator on $V$,
necessarily preserving norms. It must be onto $V$ since it preserves norms and
its image contains the dense set of finite linear combinations $\sum_i c_iv_i$. Thus (c)
holds, and $U$ is unitary.


⁴This book uses the term “unitary” for both real and complex Hilbert spaces. A unitary linear
operator from a real Hilbert space into itself is traditionally said to be orthogonal, but there is no
need to reject the word “unitary” for real Hilbert spaces.


Since unitary operators are exactly the invertible linear operators that preserve 
inner products, they are the ones that serve as isomorphisms of a Hilbert space with 
itself. Theorem 2.3 and Proposition 2.6 together give us a criterion for deciding 
whether two compact self-adjoint operators on a Hilbert space are related to each 
other by an underlying isomorphism of the Hilbert space: the criterion is that the 
two operators have the same eigenvalues, that the dimension of the eigenspace for 
each nonzero eigenvalue of one operator match the dimension of the eigenspace 
for that eigenvalue of the other operator, and that the Hilbert-space dimension of 
the zero eigenspaces of the two operators match. 


5. Classes of Compact Operators 


In this section we bring together various threads concerning compact operators, 
integral operators, the Hilbert-Schmidt Theorem, the Hilbert-Schmidt norm of 
a square matrix, and traces of matrices. The end product is to consist of some 
relationships among these notions, together with the handy notion of the trace of 
an operator. Once we have multiple Fourier series available as a tool in the next 
chapter, we will be able to supplement the results of the present section and obtain 
a formula for computing the trace of certain kinds of integral operators. Let us 
start with various notions about bounded linear operators from an abstract real 
or complex Hilbert space $V$ to itself, touching base with familiar notions when
$V = \mathbb{C}^n$.

Compact linear operators were discussed in Section 1. Compactness means
that the image of the closed unit ball has compact closure in $V$. We know
from Proposition 2.1 that the compact linear operators are exactly those that can
be approximated in the operator norm topology by linear operators with
finite-dimensional image. The adjoint of a compact linear operator is compact. Being
the members of the closure of a vector subspace, the compact linear operators form
a vector subspace. When $V = \mathbb{C}^n$, every linear operator is of course compact.

If $L$ is a compact linear operator, then $LA$ and $AL$ are compact whenever
$A$ is a bounded linear operator. In fact, if $L_n$ is a sequence of linear operators
with finite-dimensional image such that $\|L - L_n\| \to 0$, then $\|LA - L_nA\| \le
\|L - L_n\|\,\|A\| \to 0$; since $L_nA$ has finite-dimensional image, $LA$ is compact.
To see that $AL$ is compact, we take the adjoint: $L^*$ is compact, and hence
$L^*A^* = (AL)^*$ is compact; since $(AL)^*$ is compact, so is $AL$. In algebraic
terminology the compact linear operators form a two-sided ideal in the algebra
of all bounded linear operators.

Next we introduce Hilbert-Schmidt operators. If $L$ is a bounded linear operator
on $V$ and if $\{u_i\}$ and $\{v_j\}$ are orthonormal bases of $V$, then Parseval's equality
gives

$$\sum_i \|Lu_i\|^2 = \sum_{i,j} |(Lu_i, v_j)|^2 = \sum_{i,j} |(u_i, L^*v_j)|^2 = \sum_{i,j} |(L^*v_j, u_i)|^2 = \sum_{j,i} |(L^*v_j, u_i)|^2 = \sum_j \|L^*v_j\|^2.$$

Application of this formula twice shows that if we replace $\{u_i\}$ by a different
orthonormal basis $\{u_i'\}$, we get $\sum_i \|Lu_i\|^2 = \sum_i \|Lu_i'\|^2$. The expression

$$\|L\|_{\mathrm{HS}}^2 = \sum_i \|Lu_i\|^2 = \sum_{i,j} |(Lu_i, v_j)|^2,$$

which we therefore know to be independent of both orthonormal bases $\{u_i\}$ and
$\{v_j\}$, is the square of what is called the Hilbert-Schmidt norm $\|L\|_{\mathrm{HS}}$ of $L$.

For the finite-dimensional situation in which the underlying Hilbert space is
$\mathbb{R}^n$ or $\mathbb{C}^n$, we can take $\{u_i\}$ and $\{v_j\}$ both to be the standard orthonormal basis, and
then the Hilbert-Schmidt norm of the linear function corresponding to a matrix
$A$ is just $\big(\sum_{i,j} |A_{ij}|^2\big)^{1/2}$.

Our computation with $\|L\|_{\mathrm{HS}}$ above shows that

$$\|L\|_{\mathrm{HS}} = \|L^*\|_{\mathrm{HS}}.$$

The bounded linear operators that have finite Hilbert-Schmidt norm are called
Hilbert-Schmidt operators. The name results from the following proposition.


Proposition 2.7. Let $(X, \mu)$ be a $\sigma$-finite measure space such that $L^2(X, \mu)$
is separable, and let $K(\cdot, \cdot)$ be a complex-valued $L^2$ function on $X \times X$. Then
the linear operator $T$ defined by

$$(Tf)(x) = \int_X K(x, y) f(y)\, d\mu(y)$$

is a compact operator on the Hilbert space $L^2(X, \mu)$ with $\|T\|_{\mathrm{HS}} = \|K\|_2$.

REMARK. No self-adjointness is assumed in this proposition.

PROOF. If $\{u_i\}$ is an orthonormal basis of $L^2(X, \mu)$, then the functions
$(u_j \otimes \bar{u}_i)(x, y) = u_j(x)\overline{u_i(y)}$ form an orthonormal basis of $L^2(X \times X, \mu \times \mu)$
as a consequence of Proposition 12.9 of Basic. Hence

$$(Tu_i, u_j) = \int_X \int_X K(x, y)\, u_i(y)\, \overline{u_j(x)}\, d\mu(x)\, d\mu(y) = (K, u_j \otimes \bar{u}_i).$$


Taking the square of the absolute value of both sides and summing on $i$ and $j$,
we obtain $\|T\|_{\mathrm{HS}}^2 = \|K\|_2^2$.

Returning to an abstract Hilbert space $V$ and the bounded linear operators on
it, let us observe for any $L$ that

$$\|L\| \le \|L\|_{\mathrm{HS}}.$$

In fact, if $u$ in $V$ has $\|u\| = 1$, then the singleton set $\{u\}$ can be extended to an
orthonormal basis $\{u_i\}$, and we obtain $\|Lu\|^2 \le \sum_i \|Lu_i\|^2 = \|L\|_{\mathrm{HS}}^2$. Taking
the supremum over $u$ with $\|u\| = 1$, we see that $\|L\|^2 \le \|L\|_{\mathrm{HS}}^2$. Two easier but
related inequalities are that

$$\|AL\|_{\mathrm{HS}} \le \|A\|\,\|L\|_{\mathrm{HS}} \qquad\text{and}\qquad \|LA\|_{\mathrm{HS}} \le \|A\|\,\|L\|_{\mathrm{HS}}.$$

The first of these follows from the inequality $\|ALu_i\|^2 \le \|A\|^2\|Lu_i\|^2$ by summing
over an orthonormal basis. The second follows from the first because
$\|LA\|_{\mathrm{HS}} = \|(LA)^*\|_{\mathrm{HS}} = \|A^*L^*\|_{\mathrm{HS}} \le \|A^*\|\,\|L^*\|_{\mathrm{HS}} = \|A\|\,\|L\|_{\mathrm{HS}}$.

Any Hilbert-Schmidt operator is compact. In fact, if $L$ is Hilbert-Schmidt,
let $\{u_i\}$ be an orthonormal basis, let $\epsilon > 0$ be given, and choose a finite set $F$
of indices $i$ such that $\sum_{i \notin F} \|Lu_i\|^2 < \epsilon$. If $E$ is the orthogonal projection on
the span of the $u_i$ for $i$ in $F$, then we obtain $\|L^* - EL^*\|^2 = \|L - LE\|^2 \le
\|L - LE\|_{\mathrm{HS}}^2 = \sum_{i \notin F} \|(L - LE)u_i\|^2 < \epsilon$. Hence $L^*$ can be approximated in
the operator norm topology by operators with finite-dimensional image and is
compact; since $L^*$ is compact, $L$ is compact.

The sum of two Hilbert-Schmidt operators is Hilbert-Schmidt. In fact, we
have $\|(L + M)u_i\| \le \|Lu_i\| + \|Mu_i\| \le 2\max\{\|Lu_i\|, \|Mu_i\|\}$. Squaring gives
$\|(L + M)u_i\|^2 \le 4\max\{\|Lu_i\|^2, \|Mu_i\|^2\} \le 4(\|Lu_i\|^2 + \|Mu_i\|^2)$, and the
result follows when we sum on $i$. Thus the Hilbert-Schmidt operators form a
vector subspace of the bounded linear operators on $V$, in fact a vector subspace
of the compact operators on $V$. As is true of the compact operators, the
Hilbert-Schmidt operators form a two-sided ideal in the algebra of all bounded linear
operators; this fact follows from the inequalities $\|AL\|_{\mathrm{HS}} \le \|A\|\,\|L\|_{\mathrm{HS}}$ and
$\|LA\|_{\mathrm{HS}} \le \|A\|\,\|L\|_{\mathrm{HS}}$.

The vector space of Hilbert-Schmidt operators becomes a normed linear space
under the Hilbert-Schmidt norm. Even more, it is an inner-product space. To see
this, let $L$ and $M$ be Hilbert-Schmidt operators, and let $\{u_i\}$ be an orthonormal
basis. We define $(L, M) = \sum_i (Lu_i, Mu_i)$. This sum is absolutely convergent
as we see from two applications of the Schwarz inequality:
$\sum_i |(Lu_i, Mu_i)| \le \sum_i \|Lu_i\|\,\|Mu_i\| \le \bigl(\sum_i \|Lu_i\|^2\bigr)^{1/2}\bigl(\sum_i \|Mu_i\|^2\bigr)^{1/2}
= \|L\|_{\mathrm{HS}}\|M\|_{\mathrm{HS}} < \infty$.
Substituting from the definitions, we readily check that

$$(L, M) = \begin{cases}
\dfrac14 \displaystyle\sum_{k \in \{0,2\}} i^k\, \|L + i^k M\|_{\mathrm{HS}}^2 & \text{if $V$ is real,} \\[2ex]
\dfrac14 \displaystyle\sum_{k=0}^{3} i^k\, \|L + i^k M\|_{\mathrm{HS}}^2 & \text{if $V$ is complex.}
\end{cases}$$

Hence the definition of $(L, M)$ is independent of the orthonormal basis. It is
immediate from the definition and the above convergence that the form $(\cdot, \cdot)$
makes the vector space of Hilbert-Schmidt operators into an inner-product space
with associated norm $\|\cdot\|_{\mathrm{HS}}$.

If $L$ has finite-dimensional image, then $L$ is a Hilbert-Schmidt operator. In
fact, let $E$ be the orthogonal projection on image $L$, take an orthonormal basis
$\{u_i \mid i \in F\}$ of image $L$, and extend to an orthonormal basis $\{u_i \mid i \in S\}$
of $V$; here $F$ is a finite subset of $S$. Then $\sum_{i \in S} \|Lu_i\|^2 = \sum_{i \in S} \|ELu_i\|^2 =
\sum_{i \in S} \|L^*Eu_i\|^2 = \sum_{i \in F} \|L^*u_i\|^2 < \infty$. Thus the Hilbert-Schmidt operators
form an ideal between the ideal of compact operators and the ideal of operators
with finite-dimensional image.

Now we turn to a generalization of the trace $\operatorname{Tr} A = \sum_i A_{ii}$ of a square matrix
$A$. This generalization plays a basic role in distribution theory, in index theory
for partial differential equations, and in representation theory. In this section we
shall describe the operators, and at the end of Chapter III we shall show how
traces can be computed for simple integral operators. Realistic applications tend
to be beyond the scope of this book.

Although the trace of a linear operator on $\mathbb{C}^n$ may be computed as the sum of
the diagonal entries of the matrix of the operator in any basis, we shall continue
to use orthonormal bases. Thus the expression we seek to extend to any Hilbert
space $V$ is $\sum_i (Lu_i, u_i)$. The operators of “trace class” are to be a subset of the
Hilbert-Schmidt operators. It might at first appear that the condition to impose for
the definition of trace class is that $\sum_i (Lu_i, u_i)$ be absolutely convergent for some
orthonormal basis, but this condition is not enough. In fact, if a bounded linear
operator $L$ is defined on a Hilbert space with orthonormal basis $u_1, u_2, \dots$ by
$Lu_i = u_{i+1}$ for all $i$, then $(Lu_i, u_i) = 0$ for all $i$; on the other hand, $\|Lu_i\|^2 = 1$
for all $i$, and $L$ is not Hilbert-Schmidt.
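
The shift example can be made concrete with a small sketch. Vectors with finitely many nonzero coordinates are stored as dictionaries from index to coefficient; this representation and the function names are assumptions made for illustration only. The diagonal partial sums stay at 0 while the partial sums of $\|Lu_i\|^2$ grow without bound.

```python
def shift(vec):
    """Apply L, which sends u_i to u_{i+1}, to a finitely supported vector."""
    return {i + 1: c for i, c in vec.items()}

def inner(x, y):
    """Inner product of finitely supported vectors, conjugate linear in y."""
    return sum(c * y[i].conjugate() for i, c in x.items() if i in y)

def partial_sums(n):
    """Return the sums over i = 1..n of (L u_i, u_i) and of ||L u_i||^2."""
    diag = sum(inner(shift({i: 1.0}), {i: 1.0}) for i in range(1, n + 1))
    hs = sum(inner(shift({i: 1.0}), shift({i: 1.0})) for i in range(1, n + 1))
    return diag, hs

for n in (1, 10, 100):
    print(n, partial_sums(n))
```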

We say that a bounded linear operator $L$ on $V$ is of trace class if it is a
compact operator² such that $\sum_i |(Lu_i, v_i)| < \infty$ for all orthonormal bases
$\{u_i\}$ and $\{v_i\}$. Since compact operators are closed under addition and under
passage to adjoints, we see directly from the definition that the sum of two
trace-class operators is of trace class and that the adjoint of a trace-class operator is
of trace class. The operator $L = B^*A$ with $A$ and $B$ Hilbert-Schmidt is an
example of a trace-class operator. In fact, the operator $L$ is compact as the
product of two compact operators; also, $(Lu_i, v_i) = (B^*Au_i, v_i) = (Au_i, Bv_i)$,
and we therefore have

$$\sum_i |(Lu_i, v_i)| = \sum_i |(Au_i, Bv_i)| \le \sum_i \|Au_i\|\,\|Bv_i\|
\le \Bigl(\sum_i \|Au_i\|^2\Bigr)^{1/2}\Bigl(\sum_i \|Bv_i\|^2\Bigr)^{1/2} = \|A\|_{\mathrm{HS}}\|B\|_{\mathrm{HS}}.$$

The following proposition shows that there are no other examples.

²This condition is redundant; it is enough to assume boundedness. However, to proceed without
using compactness of $L$, we would have to know that $L^*L$ has a “positive semidefinite” square root,
which requires having the full Spectral Theorem for bounded self-adjoint operators. This theorem
is not available until the end of Chapter IV. The development here instead gets by with the Spectral
Theorem for compact self-adjoint operators (Theorem 2.3).


Proposition 2.8. If $L : V \to V$ is a trace-class operator on the Hilbert space
$V$, then $L$ factors as $L = B^*A$ with $A$ and $B$ Hilbert-Schmidt. Moreover, the
supremum of $\sum_i |(Lu_i, v_i)|$ over all orthonormal bases $\{u_i\}$ and $\{v_i\}$ equals the
infimum, over all Hilbert-Schmidt $A$ and $B$ such that $L = B^*A$, of the product
$\|A\|_{\mathrm{HS}}\|B\|_{\mathrm{HS}}$.


PROOF. First we produce a factorization. Since $L$ is a compact operator,
$L^*L$ is a compact self-adjoint operator, and Theorem 2.3 shows that $L^*L$ has an
orthonormal basis of eigenvectors $w_i$ with real eigenvalues $\lambda_i$ tending to 0. Since
$\lambda_i(w_i, w_i) = (L^*Lw_i, w_i) = (Lw_i, Lw_i)$, we see that all $\lambda_i$ are $\ge 0$. Define a
bounded linear operator $T$ by $Tw_i = \sqrt{\lambda_i}\, w_i$ for all $i$. The operator $T$ is self
adjoint, it has $(Tv, v) \ge 0$ for all $v$, its kernel $N$ is the smallest closed vector
subspace containing all the $w_i$ with $\lambda_i = 0$, and its image is dense in $N^\perp$. Since
$N \cap N^\perp = 0$, $T$ is one-one from $N^\perp$ into $N^\perp$. Thus $Tv \mapsto Lv$ is a well-defined
linear function from a dense vector subspace of $N^\perp$ into $V$. The map
$Tv \mapsto Lv$ has the property that $\|Lv\|^2 = (Lv, Lv) = (L^*Lv, v) = (T^2v, v) =
(Tv, Tv) = \|Tv\|^2$. Thus $Tv \mapsto Lv$ is a linear isometry from a dense vector
subspace of $N^\perp$ into $V$. Since $V$ is complete, $Tv \mapsto Lv$ extends to a linear
isometry $U : N^\perp \to V$. This $U$ satisfies $L = UT$.

Let $I$ be the set of indices $i$ for the orthonormal basis $\{w_i\}$, and let $P$ be the
subset with $\lambda_i > 0$. By polarization, $U$ preserves inner products in carrying $N^\perp$
into $V$. Extend $U$ to all of $V$ by setting it equal to 0 on $N$, so that $U^*$ is well
defined. The system $\{w_i\}_{i \in P}$ is an orthonormal basis of $N^\perp$, and hence the system
$\{f_i\}_{i \in P}$ with $f_i = Uw_i$ for $i \in P$ is an orthonormal set in $V$. Since $U : N^\perp \to V$
is isometric, we have $(w_j, U^*f_i) = (Uw_j, f_i) = (Uw_j, Uw_i) = (w_j, w_i)$. Since
$Tw_i$ is a multiple of $w_i$, we obtain $(Tw_i, U^*f_i) = (Tw_i, w_i)$. Therefore

$$\sum_{i \in P} |(Lw_i, f_i)| = \sum_{i \in P} |(UTw_i, f_i)| = \sum_{i \in P} |(Tw_i, U^*f_i)|
= \sum_{i \in P} |(Tw_i, w_i)| = \sum_{i \in P} (Tw_i, w_i).$$

Extend $\{f_i\}_{i \in P}$ to an orthonormal basis $\{f_i\}$ of $V$; since any two orthonormal
bases of a Hilbert space have the same cardinality, we can index the new vectors
of this set by $I - P$. The operators $L$ and $T$ have the same kernel, and thus the
sums for $i \in P$ can be extended over all $i$ in $I$ to give

$$\sum_{i \in I} |(Lw_i, f_i)| = \sum_{i \in I} (Tw_i, w_i).$$


Define a bounded linear operator $S$ on $V$ by $Sw_i = \sqrt[4]{\lambda_i}\, w_i$ for all $i$. Then
$|(Sw_i, w_j)|^2 = \delta_{ij}(S^2w_i, w_i) = \delta_{ij}(Tw_i, w_i)$, and hence $S$ is a Hilbert-Schmidt
operator with $\|S\|_{\mathrm{HS}}^2 = \sum_{i \in I} (Tw_i, w_i)$. Take $A = S$ and $B^* = US$; each
of these is Hilbert-Schmidt since $\|US\|_{\mathrm{HS}} \le \|U\|\,\|S\|_{\mathrm{HS}}$, and we have $B^*A =
USS = UT = L$. This proves the existence of a decomposition $B^*A = L$.

For the bases $\{w_i\}$ and $\{f_i\}$, we have just seen that

$$\|A\|_{\mathrm{HS}}\|B\|_{\mathrm{HS}} \le \|S\|_{\mathrm{HS}}\|US\|_{\mathrm{HS}} \le \|S\|_{\mathrm{HS}}^2 = \sum_{i \in I} (Tw_i, w_i) = \sum_{i \in I} |(Lw_i, f_i)|.$$
iel ie] 


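But if $L = B'^*A'$ is any decomposition of $L$ as the product of Hilbert-Schmidt
operators and if $\{u_i\}$ and $\{v_i\}$ are any two orthonormal bases, we have

$$\sum_i |(Lu_i, v_i)| = \sum_i |(B'^*A'u_i, v_i)| = \sum_i |(A'u_i, B'v_i)|
\le \sum_i \|A'u_i\|\,\|B'v_i\| \le \|A'\|_{\mathrm{HS}}\|B'\|_{\mathrm{HS}}.$$

Therefore

$$\sup \sum_i |(Lu_i, v_i)| \le \inf \|A'\|_{\mathrm{HS}}\|B'\|_{\mathrm{HS}}.$$

Since the particular decomposition $L = B^*A$ and the particular bases $\{w_i\}$ and $\{f_i\}$
constructed above satisfy $\|A\|_{\mathrm{HS}}\|B\|_{\mathrm{HS}} \le \sum_{i \in I} |(Lw_i, f_i)|$, the reverse inequality
holds as well, and the supremum equals the infimum, as asserted.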


If $\{u_i\}$ is an orthonormal basis of $V$ and $L$ is of trace class, we can thus write
$L = B^*A$ with $A$ and $B$ Hilbert-Schmidt. We define the trace of $L$ to be

$$\operatorname{Tr} L = \sum_i (Lu_i, u_i) = \sum_i (B^*Au_i, u_i) = \sum_i (Au_i, Bu_i) = (A, B).$$

The series $\sum_i (Lu_i, u_i)$ is absolutely convergent by definition of trace class. The
trace of $L$ is independent of the orthonormal basis since it equals $(A, B)$, and it
is independent of $A$ and $B$ since it equals $\sum_i (Lu_i, u_i)$.
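
In finite dimensions every operator is of trace class, and the identity $\operatorname{Tr}(B^*A) = (A, B)$ can be checked directly. The sketch below does this for two hypothetical $2 \times 2$ matrices `A` and `B` and two orthonormal bases of $\mathbb{C}^2$; the data and helper names are assumptions for illustration.

```python
import math

def matvec(M, x):
    """Apply the matrix M (a list of rows) to the vector x."""
    return [sum(a * c for a, c in zip(row, x)) for row in M]

def inner(x, y):
    """Inner product, linear in x and conjugate linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def adjoint(M):
    """Conjugate transpose of M."""
    return [[complex(M[i][j]).conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def matmul(M, N):
    """Matrix product M N."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
            for i in range(len(M))]

def trace_in_basis(M, basis):
    """Sum of (M u, u) over the vectors u of an orthonormal basis."""
    return sum(inner(matvec(M, u), u) for u in basis)

def hs_inner(A, B, basis):
    """Hilbert-Schmidt inner product (A, B) = sum of (A u, B u)."""
    return sum(inner(matvec(A, u), matvec(B, u)) for u in basis)

# Hypothetical example data.
A = [[1, 2j], [0, 1 - 1j]]
B = [[2, 1], [1j, 3]]
L = matmul(adjoint(B), A)

t = 1.1
standard = [[1, 0], [0, 1]]
rotated = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

print(trace_in_basis(L, standard), trace_in_basis(L, rotated), hs_inner(A, B, rotated))
```

The three printed values should coincide up to rounding error; for this data the common value is $5 - i$.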

In practice it is not so easy to check from the definition that L is of trace class. 
But there is a simple sufficient condition. 


Proposition 2.9. If $L : V \to V$ is a bounded linear operator on the Hilbert
space $V$ and if $\sum_{i,j} |(Lu_i, v_j)| < \infty$ for some orthonormal bases $\{u_i\}$ and $\{v_j\}$,
then $L$ is of trace class.

PROOF. Since $|(Lu_i, v_j)| \le \|L\|$, we have $|(Lu_i, v_j)|^2 \le \|L\|\,|(Lu_i, v_j)|$ for
all $i$ and $j$, and it follows from the finiteness of $\sum_{i,j} |(Lu_i, v_j)|$ that $\|L\|_{\mathrm{HS}}^2 =
\sum_{i,j} |(Lu_i, v_j)|^2$ is finite. Thus $L$ is a Hilbert-Schmidt operator and has to be
compact.

If $\{e_k\}$ and $\{f_k\}$ are orthonormal bases, we expand $e_k = \sum_i (e_k, u_i)u_i$ and $f_k =
\sum_j (f_k, v_j)v_j$ and substitute to obtain $(Le_k, f_k) = \sum_{i,j} (e_k, u_i)(Lu_i, v_j)\overline{(f_k, v_j)}$.
Taking the absolute value and summing on $k$ gives

$$\sum_k |(Le_k, f_k)| \le \sum_{i,j} |(Lu_i, v_j)| \sum_k |(e_k, u_i)(f_k, v_j)|.$$

Application of the Schwarz inequality to the sum on $k$ and then Bessel's inequality
to each factor of the result yields

$$\sum_k |(Le_k, f_k)| \le \sum_{i,j} |(Lu_i, v_j)| \Bigl(\sum_k |(e_k, u_i)|^2\Bigr)^{1/2}\Bigl(\sum_k |(f_k, v_j)|^2\Bigr)^{1/2}
\le \sum_{i,j} |(Lu_i, v_j)|\,\|u_i\|\,\|v_j\| = \sum_{i,j} |(Lu_i, v_j)| < \infty,$$

and therefore $L$ is of trace class.


6. Problems 


1. Let $(S, \mu)$ be a $\sigma$-finite measure space, let $f$ be in $L^\infty(S, \mu)$, and let $M_f$ be the
bounded linear operator on $L^2(S, \mu)$ given by $M_f(g) = fg$.
(a) Find a necessary and sufficient condition for $M_f$ to have an eigenvector.
(b) Find a necessary and sufficient condition for $M_f$ to be compact.


2. Let $L$ be a compact operator on a Hilbert space, and let $\lambda$ be a nonzero complex
number. Prove that if $\lambda I - L$ is one-one, then the image of $\lambda I - L$ is closed.


3. Prove for a Hilbert space $V$ that the normed linear space of Hilbert-Schmidt
operators with the norm $\|\cdot\|_{\mathrm{HS}}$ is a Banach space.


4. If $L$ is a trace-class operator on a Hilbert space $V$, let $\|L\|_{\mathrm{TC}}$ equal the supremum
of $\sum_i |(Lu_i, v_i)|$ over all orthonormal bases $\{u_i\}$ and $\{v_i\}$. By Proposition 2.8
this equals the infimum, over all Hilbert-Schmidt $A$ and $B$ such that $L = B^*A$,
of the product $\|A\|_{\mathrm{HS}}\|B\|_{\mathrm{HS}}$. Prove that the vector space of trace-class operators
is a normed linear space under $\|\cdot\|_{\mathrm{TC}}$ as norm.


5. If $L$ is a trace-class operator on a complex Hilbert space $V$ and $A$ is a bounded
linear operator, prove that $\operatorname{Tr} AL = \operatorname{Tr} LA$ and conclude that $\operatorname{Tr}(BLB^{-1}) = \operatorname{Tr} L$
for any bounded linear operator $B$.


Problems 6-8 deal with some extensions of Theorem 2.3 to situations involving
several operators. A bounded linear operator $L$ is said to be normal if $LL^* = L^*L$.

6. Suppose that $\{L_\alpha\}$ is a finite commuting family of compact self-adjoint operators
on a Hilbert space. Prove that there exists an orthonormal basis consisting of
simultaneous eigenvectors for all $L_\alpha$.

7. Fix a complex Hilbert space $V$.
(a) Prove that the decomposition $L = \tfrac12(L + L^*) + i\,\tfrac{1}{2i}(L - L^*)$ exhibits any
normal operator $L : V \to V$ as a linear combination of commuting
self-adjoint operators.
(b) Prove that the operators in (a) are compact if $L$ is compact.
(c) State an extension of Theorem 2.3 that concerns compact normal operators
on a complex Hilbert space.

8. Fix a Hilbert space $V$.
(a) Prove that a unitary operator from $V$ to itself is always normal.
(b) Under what circumstances is a unitary operator compact?


Problems 9-13 indicate an approach to second-order ordinary differential equations
by integral equations in a way that predates the use of the Hilbert-Schmidt Theorem.

9. For $\omega \ne 0$, show that the unique solution $u(t)$ on $[a, b]$ of the equation $u'' + \omega^2 u =
g(t)$ and the initial conditions $u(a) = 1$ and $u'(a) = 0$ is

$$u(t) = \cos\omega(t - a) + \omega^{-1}\int_a^t g(s)\sin\omega(t - s)\,ds.$$

10. Let $\rho(t)$ be a continuous function on $[a, b]$, and let $u(t)$ be the unique solution
of the equation $u'' + [\omega^2 - \rho(t)]u = 0$ and the initial conditions $u(a) = 1$ and
$u'(a) = 0$. Show that $u$ satisfies the integral equation

$$u(t) - \omega^{-1}\int_a^t \rho(s)\sin\omega(t - s)\,u(s)\,ds = \cos\omega(t - a),$$

which is of the form $u(t) - \int_a^t K(t, s)u(s)\,ds = f(t)$, where $K(t, s)$ is continuous
on the triangle $a \le s \le t \le b$.

11. Let $K(t, s)$ be continuous on the triangle $a \le s \le t \le b$. For $f$ continuous on
$[a, b]$, define $(Tf)(t) = \int_a^t K(t, s)f(s)\,ds$.
(a) Prove that $f$ continuous implies $Tf$ continuous.
(b) Put $M = \max |K(t, s)|$. If $f$ has $C = \int_a^b |f(t)|\,dt$, prove inductively that
$|(T^n f)(t)| \le \dfrac{CM^n}{(n-1)!}(t - a)^{n-1}$ for $n \ge 1$.
(c) Deduce that the series $f + Tf + T^2f + \cdots$ converges uniformly on $[a, b]$.

12. Set $u = f + Tf + T^2f + \cdots$ in the previous problem, and prove that $u$ satisfies
$u - Tu = f$.

13. In the previous problem prove that $u = f + Tf + T^2f + \cdots$ is the only solution
of $u - Tu = f$.


CHAPTER III 


Topics in Euclidean Fourier Analysis 


Abstract. This chapter takes up several independent topics in Euclidean Fourier analysis, all having 
some bearing on the subject of partial differential equations. 

Section 1 elaborates on the relationship between the Fourier transform and the Schwartz space,
the subspace of $L^1(\mathbb{R}^N)$ consisting of smooth functions with the property that the product of any
iterated partial derivative of the function with any polynomial is bounded. It is possible to make 
the Schwartz space into a metric space, and then one can consider the space of continuous linear 
functionals; these continuous linear functionals are called “tempered distributions.” The Fourier 
transform carries the space of tempered distributions in one-one fashion onto itself. 

Section 2 concerns weak derivatives, and the main result is Sobolev’s Theorem, which tells how 
to recover information about ordinary derivatives from information about weak derivatives. Weak 
derivatives are easy to manipulate, and Sobolev’s Theorem is therefore a helpful tool for handling 
derivatives without continually having to check the validity of interchanges of limits. 

Sections 3-4 concern harmonic functions, those functions on open sets in Euclidean space that 
are annihilated by the Laplacian. The main results of Section 3 are a characterization of harmonic 
functions in terms of a mean-value property, a reflection principle that allows the extension to all of 
Euclidean space of any harmonic function in a half space that vanishes at the boundary, and a result 
of Liouville that the only bounded harmonic functions in all of Euclidean space are the constants. 
The main result of Section 4 is a converse to properties of Poisson integrals for half spaces, showing 
that harmonic functions in a half space are given as Poisson integrals of functions or of finite complex 
measures if their $L^p$ norms over translates of the bounding Euclidean space are bounded.

Sections 5-6 concern the Calderón-Zygmund Theorem, a far-reaching generalization of the
theorem concerning the boundedness of the Hilbert transform. Section 5 gives the statement and 
proof, and two applications are the subject of Section 6. One of the applications is to Riesz transforms, 
and the other is to the Beltrami equation, whose solutions are “quasiconformal mappings.” 

Sections 7-8 concern multiple Fourier series for smooth periodic functions. The theory is 
established in Section 7, and an application to traces of integral operators is given in Section 8. 


1. Tempered Distributions 


We fix normalizations for the Euclidean Fourier transform as in Basic: For $f$ in
$L^1(\mathbb{R}^N)$, the definition is

$$(\mathcal{F}f)(y) = \widehat{f}(y) = \int_{\mathbb{R}^N} f(x)\, e^{-2\pi i x \cdot y}\, dx,$$

with $x \cdot y$ referring to the dot product and with the $2\pi$ in the exponent. The
inversion formula is valid whenever $\widehat{f}$ is in $L^1$; it says that $f$ is recovered as

$$f(x) = (\mathcal{F}^{-1}\widehat{f}\,)(x) = \int_{\mathbb{R}^N} \widehat{f}(y)\, e^{2\pi i x \cdot y}\, dy$$

almost everywhere, including at all points of continuity of $f$. The operator $\mathcal{F}$
carries $L^1 \cap L^2$ into $L^2$ and extends to a linear map $\mathcal{F}$ of $L^2$ onto $L^2$ such that
$\|\mathcal{F}f\|_2 = \|f\|_2$. This is the Plancherel formula.

The Schwartz space $\mathcal{S} = \mathcal{S}(\mathbb{R}^N)$ is the vector space of all functions $f$ in
$C^\infty(\mathbb{R}^N)$ such that the product of any polynomial by any iterated partial derivative
of $f$ is bounded. This is a vector subspace of $L^1 \cap L^2$, and it was shown in Basic
that $\mathcal{F}$ carries $\mathcal{S}$ one-one onto itself. It will be handy sometimes to use a notation
for partial derivatives and their iterates that is different from that in Chapter I.
Namely,¹ let $D_j = \dfrac{\partial}{\partial x_j}$. If $\alpha = (\alpha_1, \dots, \alpha_N)$ is an $N$-tuple of nonnegative
integers, we write $|\alpha| = \sum_{j=1}^N \alpha_j$, $\alpha! = \alpha_1! \cdots \alpha_N!$, $x^\alpha = x_1^{\alpha_1} \cdots x_N^{\alpha_N}$, and $D^\alpha =
D_1^{\alpha_1} \cdots D_N^{\alpha_N}$. Addition of such tuples $\alpha$ is defined component by component, and
we say that $\alpha \le \beta$ if $\alpha_j \le \beta_j$ for $1 \le j \le N$. We write $|\alpha|$ for the total
order $\alpha_1 + \cdots + \alpha_N$, and we call $\alpha$ a multi-index. If $Q(x) = \sum_\alpha a_\alpha x^\alpha$ is a
complex-valued polynomial on $\mathbb{R}^N$, define $Q(D)$ to be the partial differential
operator $\sum_\alpha a_\alpha D^\alpha$ with constant coefficients obtained by substituting, for each
$j$ with $1 \le j \le N$, the operator $D_j = \dfrac{\partial}{\partial x_j}$ for $x_j$. The Schwartz functions are
then the smooth functions $f$ on $\mathbb{R}^N$ such that $P(x)Q(D)f$ is bounded for each
pair of polynomials $P$ and $Q$.

The Schwartz space is directly usable in connection with certain linear partial
differential equations with constant coefficients. A really simple example
concerns the Laplacian operator $\Delta = \dfrac{\partial^2}{\partial x_1^2} + \cdots + \dfrac{\partial^2}{\partial x_N^2}$, which we can write as
$\Delta = |D|^2$ in the new notation for differential operators. Specifically the equation

$$(1 - \Delta)u = f$$

has a unique solution $u$ in $\mathcal{S}$ for each $f$ in $\mathcal{S}$. To see this, we take the Fourier
transform of both sides, obtaining $\mathcal{F}u - \mathcal{F}(\Delta u) = \mathcal{F}f$ or $\mathcal{F}u - \mathcal{F}(|D|^2(u)) = \mathcal{F}f$.
Using the formulas relating the Fourier transform, multiplication by polynomials,
and differentiation,² we can rewrite this equation as $(1 + 4\pi^2|y|^2)\mathcal{F}(u) = \mathcal{F}(f)$.
Problem 1 at the end of the chapter asks one to check that $(1 + 4\pi^2|y|^2)^{-1}g$ is in $\mathcal{S}$ if
$g$ is in $\mathcal{S}$, and then existence of a solution in $\mathcal{S}$ to the differential equation is proved
by the formula $u = \mathcal{F}^{-1}\bigl((1 + 4\pi^2|y|^2)^{-1}\mathcal{F}(f)\bigr)$. For uniqueness let $u_1$ and $u_2$ be
two solutions in $\mathcal{S}$ corresponding to the same $f$. Then $(1 - \Delta)(u_1 - u_2) = 0$, and
hence $(1 + 4\pi^2|y|^2)\mathcal{F}(u_1 - u_2)(y) = 0$ for all $y$. Therefore $\mathcal{F}(u_1 - u_2)(y) = 0$
everywhere. Since $\mathcal{F}$ is one-one on $\mathcal{S}$, we conclude that $u_1 = u_2$.

¹Some authors prefer to abbreviate $\dfrac{\partial}{\partial x_j}$ as $\partial_j$, reserving the notation $D_j$ for the product of $\partial_j$ and
a certain imaginary scalar that depends on the definition of the Fourier transform.

²These, with hypotheses in place, appear as Proposition 8.1 of Basic.
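
The multiplier identity behind this argument can be checked numerically in dimension $N = 1$. The sketch below takes the Schwartz function $f(x) = e^{-\pi x^2}$, whose Fourier transform with the present normalization is $e^{-\pi y^2}$, and compares $\mathcal{F}((1 - \Delta)f)(y)$ with $(1 + 4\pi^2 y^2)\mathcal{F}(f)(y)$ by a trapezoid rule on a truncated interval. The truncation width, step count, sample point `y`, and function names are assumptions made for illustration.

```python
import cmath
import math

def fourier(h, y, half_width=8.0, steps=4000):
    """Trapezoid approximation to the integral of h(x) exp(-2 pi i x y) over [-half_width, half_width]."""
    dx = 2 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * h(x) * cmath.exp(-2j * math.pi * x * y)
    return total * dx

def f(x):
    """The Gaussian exp(-pi x^2)."""
    return math.exp(-math.pi * x * x)

def f_second(x):
    """Second derivative of f, computed by hand."""
    return (4 * math.pi ** 2 * x * x - 2 * math.pi) * math.exp(-math.pi * x * x)

y = 0.6
lhs = fourier(lambda x: f(x) - f_second(x), y)          # transform of (1 - Delta) f
rhs = (1 + 4 * math.pi ** 2 * y * y) * fourier(f, y)    # multiplier times transform of f
print(lhs, rhs)
```

The two printed values should agree to many digits, since the integrands are negligible outside the truncated interval.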

A deeper use of the Schwartz space in connection with linear partial differential 
equations comes about because of the relationship between the Schwartz space 
and the theory of “distributions.” Distributions are continuous linear functionals 
on vector spaces of smooth functions, i.e., continuous linear maps from such a 
space to the scalars, and they will be considered more extensively in Chapter V. 
For now, we shall be content with discussing “tempered distributions,” the dis- 
tributions associated with the Schwartz space. In order to obtain a well-defined 
notion of continuity, we shall describe how to make $\mathcal{S}(\mathbb{R}^N)$ into a metric space.

For each pair of polynomials $P$ and $Q$, we define

$$\|f\|_{P,Q} = \sup_{x \in \mathbb{R}^N} |P(x)(Q(D)f)(x)|.$$

Each function $\|\cdot\|_{P,Q}$ on $\mathcal{S}$ is a seminorm on $\mathcal{S}$ in the sense that³

(i) $\|f\|_{P,Q} \ge 0$ for all $f$ in $\mathcal{S}$,
(ii) $\|cf\|_{P,Q} = |c|\,\|f\|_{P,Q}$ for all $f$ in $\mathcal{S}$ and all scalars $c$,
(iii) $\|f + g\|_{P,Q} \le \|f\|_{P,Q} + \|g\|_{P,Q}$ for all $f$ and $g$ in $\mathcal{S}$.

Collectively these seminorms have a property that goes in the converse direction
to (i), namely

(iv) $\|f\|_{P,Q} = 0$ for all $P$ and $Q$ implies $f = 0$.

In fact, $f$ will already be 0 if the seminorm for $P = Q = 1$ is 0 on $f$.

Each seminorm gives rise to a pseudometric $d_{P,Q}(f, g) = \|f - g\|_{P,Q}$ in
the usual way, and the topology on $\mathcal{S}$ is the weakest topology making all the
functions $d_{P,Q}(\cdot, g)$ continuous. That is, a base for the topology consists of all
sets $U_{g,P,Q,n} = \{f \mid \|f - g\|_{P,Q} < 1/n\}$.

A feature of $\mathcal{S}$ is that only countably many of the seminorms are relevant for
obtaining the open sets, and a consequence is that the topology of $\mathcal{S}$ is defined by a
metric. The important seminorms are the ones in which $P$ and $Q$ are monomials,
each with coefficient 1. In fact, if $P(x) = \sum_\alpha a_\alpha x^\alpha$ and $Q(x) = \sum_\beta b_\beta x^\beta$, then
it is easy to check that $d_{P,Q}(f, g) \le \sum_{\alpha,\beta} |a_\alpha b_\beta|\, d_{x^\alpha, x^\beta}(f, g)$. Hence any open
set that $d_{P,Q}$ defines is a union of finite intersections of the open sets defined by
the finitely many $d_{x^\alpha, x^\beta}$'s.

³The reader may notice that the definition of “seminorm” is the same as the definition of
“pseudonorm” in Basic. The only distinction is that the word “seminorm” is often used in the
context of a whole family of such objects, while the word “pseudonorm” is often used when there is
only one such object under consideration.

Let us digress and consider the situation more abstractly because it will arise
again later. Suppose we have a real or complex vector space $V$ on which are
defined countably many seminorms $\|\cdot\|_n$ satisfying (i), (ii), and (iii) above.
Each seminorm $\|\cdot\|_n$ gives rise to a pseudometric $d_n$ on $V$ and then to open
sets defined relative to $d_n$. For any pseudometric $\rho$, the function $\rho' = \min\{1, \rho\}$
is easily checked to be a pseudometric, and $\rho'$ defines the same open sets on $V$ as
$\rho$ does. We shall use the following abstract result about pseudometrics; this was
proved as Proposition 10.28 of Basic, and we therefore omit the proof here.


Proposition 3.1. Suppose that $V$ is a nonempty set and $\{d_n\}_{n \ge 1}$ is a sequence
of pseudometrics on $V$ such that $d_n(x, y) \le 1$ for all $n$ and for all $x$ and $y$ in $V$.
Then $d(x, y) = \sum_{n=1}^\infty 2^{-n} d_n(x, y)$ is a pseudometric. If the open balls relative
to $d_n$ are denoted by $B_n(r; x)$ and the open balls relative to $d$ are denoted by
$B(r; x)$, then the $B_n$'s and $B$'s are related as follows:

(a) whenever some $B_n(r_n; x)$ is given with $r_n > 0$, there exists some $B(r; x)$
with $r > 0$ such that $B(r; x) \subseteq B_n(r_n; x)$,

(b) whenever $B(r; x)$ is given with $r > 0$, there exist finitely many $r_n > 0$,
say for $n \le K$, such that $\bigcap_{n=1}^K B_n(r_n; x) \subseteq B(r; x)$.


In the situation with countably many seminorms $\|\cdot\|_n$ for the vector space $V$
we see that we can introduce a pseudometric $d$ such that three conditions hold:

• $d(x, y) = d(0, y - x)$ for all $x$ and $y$,

• whenever some $x$ in $V$ is given and an index $n$ and corresponding number
$r_n > 0$ are given, then there is a number $r > 0$ such that $d(x, y) < r$
implies $\|y - x\|_n < r_n$,

• whenever some $x$ in $V$ is given and some $r > 0$ is given, then there exist
finitely many $r_n > 0$, say for $n \le K$, such that $\|y - x\|_n < r_n$
for all $n \le K$ implies $d(x, y) < r$.

If the seminorms collectively have the property that $\|x\|_n = 0$ for all $n$ only for
$x = 0$, then $d$ is a metric, and we say that the family of seminorms is a separating
family. The specific form of $d$ is not important: in the case of $\mathcal{S}$, the metric $d$
depended on the choice of the countable subfamily of pseudometrics and the order
in which they were enumerated, and these choices do not affect any results about
$\mathcal{S}$. The important thing about this construction is that it shows that the topology
is given by some metric.
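
The construction can be sketched in a few lines for a finite family of seminorms, combining the truncation $\min\{1, \cdot\}$ with the weights $2^{-n}$ of Proposition 3.1. The choice of $V = \mathbb{R}^3$ with the coordinate absolute values as seminorms is a hypothetical stand-in for a separating family, and the function name is an assumption.

```python
def combined_pseudometric(seminorms, x, y):
    """d(x, y) = sum over n >= 1 of 2**(-n) * min(1, ||y - x||_n)."""
    diff = [b - a for a, b in zip(x, y)]
    return sum(min(1.0, p(diff)) / 2 ** n for n, p in enumerate(seminorms, start=1))

# Hypothetical separating family on R^3: absolute values of the three coordinates.
seminorms = [lambda v, k=k: abs(v[k]) for k in range(3)]

x = [0.0, 1.0, 2.0]
y = [0.5, 1.0, 5.0]
z = [0.25, -1.0, 2.0]
zero = [0.0, 0.0, 0.0]

d_xy = combined_pseudometric(seminorms, x, y)
d_shift = combined_pseudometric(seminorms, zero, [b - a for a, b in zip(x, y)])
print(d_xy, d_shift)
```

The two printed values coincide, which is the translation invariance $d(x, y) = d(0, y - x)$ of the first bulleted condition.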

The three conditions marked with bullets enable us to detect continuity of 
linear functions with domain V and range another such space W by using the 
seminorms directly. 


Proposition 3.2. Let $L : V \to W$ be a linear function between vector spaces
that are both real or both complex. Suppose that $V$ is topologized by means of
countably many seminorms $\|\cdot\|_{V,m}$ and $W$ is topologized by means of countably
many seminorms $\|\cdot\|_{W,n}$. Then $L$ is continuous if and only if for each $n$, there
is a finite set $F = F(n)$ of $m$'s and there are corresponding positive numbers $\delta_m$
such that $\|v\|_{V,m} < \delta_m$ for all $m \in F$ implies $\|L(v)\|_{W,n} < 1$.

PROOF. Let $d_V$ and $d_W$ be the distance functions in $V$ and $W$. When $n$ is
given, the second item in the bulleted list shows that there is some $r > 0$ such
that $d_W(0, w) < r$ implies $\|w\|_{W,n} < 1$. If $L$ is continuous at 0, then there is a
$\delta > 0$ such that $d_V(0, v) < \delta$ implies $d_W(0, L(v)) < r$. From the third item in
the bulleted list, we know that there is a finite set $F$ of indices $m$ and there are
corresponding numbers $\delta_m > 0$ such that $\|v\|_{V,m} < \delta_m$ for all $m$ in $F$ implies
$d_V(0, v) < \delta$. Then $\|v\|_{V,m} < \delta_m$ for all $m$ in $F$ implies $\|L(v)\|_{W,n} < 1$.

Conversely suppose for each $n$ that there is a finite set $F$ and there are numbers
$\delta_m > 0$ for $m$ in $F$ such that the stated condition holds. To see that $L$ is continuous
at 0, let $\epsilon > 0$ be given. Choose $K$ and numbers $\epsilon_n > 0$ for $n \le K$ such
that $\|w\|_{W,n} < \epsilon_n$ for $n \le K$ implies $d_W(0, w) < \epsilon$. For each $n \le K$, the
given condition on $L$ allows us to find a finite set $F_n$ of indices $m$ and numbers
$\delta_m > 0$ such that $\|v\|_{V,m} < \delta_m$ for all $m$ in $F_n$ implies $\|L(v)\|_{W,n} < 1$. If
$\|v\|_{V,m} < \delta_m\epsilon_n$ for all $m$ in $F = \bigcup_{n \le K} F_n$, then $\|L(v)\|_{W,n} < \epsilon_n$ for all
$n \le K$ and hence $d_W(0, L(v)) < \epsilon$. We know that there is a number $\delta > 0$ such
that $d_V(0, v) < \delta$ implies $\|v\|_{V,m} < \delta_m\epsilon_n$ for all $m$ in $F$, and then
$d_W(0, L(v)) < \epsilon$. Hence $L$ is continuous at 0.

Once $L$ is continuous at 0, it is continuous everywhere because of the translation
invariance of $d_V$ and $d_W$: $d_V(v_1, v_2) = d_V(0, v_2 - v_1)$ and $d_W(L(v_1), L(v_2)) =
d_W(0, L(v_2) - L(v_1)) = d_W(0, L(v_2 - v_1))$.


Now we return to the Schwartz space $\mathcal{S}$ to apply our construction and
Proposition 3.2. The bulleted items above make it clear that it does not matter which
countable set of generating seminorms we use nor what order we put them in; the
open sets and the criterion for continuity are still the same. The following corollary
is immediate from Proposition 3.2, the definition of $\mathcal{S}$, and the behavior of the
Fourier transform under multiplication by polynomials and under differentiation.


Corollary 3.3. For the Schwartz space $\mathcal{S}$ on $\mathbb{R}^N$,

(a) a linear functional $\ell$ is continuous if and only if there is a finite set
$F$ of pairs $(P, Q)$ of polynomials and there are corresponding numbers
$\delta_{P,Q} > 0$ such that $\|f\|_{P,Q} < \delta_{P,Q}$ for all $(P, Q)$ in $F$ implies $|\ell(f)| < 1$,

(b) the Fourier transform mapping $\mathcal{F} : \mathcal{S} \to \mathcal{S}$ is continuous, and so is its
inverse.


A continuous linear functional on the Schwartz space is called a tempered
distribution, and the space of all tempered distributions is denoted by $\mathcal{S}' =
\mathcal{S}'(\mathbb{R}^N)$. It will be convenient to write $\langle T, \varphi \rangle$ for the value of the tempered
distribution $T$ on the Schwartz function $\varphi$. The space of tempered distributions
is huge. A few examples will give an indication just how huge it is.


EXAMPLES.

(1) Any function $f$ on $\mathbb{R}^N$ with $|f(x)| \le (1 + |x|^2)^n |g(x)|$ for some integer $n$
and some integrable function $g$ defines a tempered distribution $T$ by integration:
$\langle T, \varphi \rangle = \int_{\mathbb{R}^N} f(x)\varphi(x)\,dx$ when $\varphi$ is in $\mathcal{S}$. In view of Corollary 3.3a, the
continuity follows from the chain of inequalities

$$|\langle T, \varphi \rangle| \le \int_{\mathbb{R}^N} \bigl(|f(x)|(1 + |x|^2)^{-n}\bigr)\bigl((1 + |x|^2)^n |\varphi(x)|\bigr)\,dx
\le \Bigl(\int_{\mathbb{R}^N} |g(x)|\,dx\Bigr)\Bigl(\sup_x \bigl\{(1 + |x|^2)^n |\varphi(x)|\bigr\}\Bigr)
= \|g\|_1 \|\varphi\|_{P,1} \quad\text{for } P(x) = (1 + |x|^2)^n.$$

(2) Any function $f$ with $|f(x)| \le (1 + |x|^2)^n |g(x)|$ for some integer $n$ and some
function $g$ in $L^\infty(\mathbb{R}^N)$ defines a tempered distribution $T$ by integration: $\langle T, \varphi \rangle =
\int_{\mathbb{R}^N} f(x)\varphi(x)\,dx$. In fact, $|f(x)| \le (1 + |x|^2)^{n+N}\bigl((1 + |x|^2)^{-N}|g(x)|\bigr)$, and
$(1 + |x|^2)^{-N}|g(x)|$ is integrable; hence this example is an instance of Example 1.

(3) Any function $f$ with $|f(x)| \le (1 + |x|^2)^n |g(x)|$ for some integer $n$ and
some function $g$ in $L^p(\mathbb{R}^N)$, where $1 \le p \le \infty$, defines a tempered distribution
$T$ by integration because such a distribution is the sum of one as in Example 1
and one as in Example 2.

(4) Suppose that $f$ is as in Example 3 and that $Q(D)$ is a constant-coefficients
partial differential operator. Then the formula $\langle T, \varphi \rangle = \int_{\mathbb{R}^N} f(x)(Q(D)\varphi)(x)\,dx$
defines a tempered distribution.

(5) In the above examples, Lebesgue measure $dx$ may be replaced by any Borel
measure $d\mu(x)$ on $\mathbb{R}^N$ such that $\int_{\mathbb{R}^N} (1 + |x|^2)^{-n_0}\,d\mu(x) < \infty$ for some $n_0$. A
particular case of interest is that $d\mu(x)$ is a point mass at a point $x_0$; in this case,
the tempered distributions $\varphi \mapsto \langle T, \varphi \rangle$ that are obtained by combining the above
constructions are the linear combinations of iterated partial derivatives of $\varphi$ at the
point $x_0$.

(6) Any finite linear combination of tempered distributions as in Example 5 is
again a tempered distribution.


Two especially useful operations on tempered distributions are multiplication
by a Schwartz function and differentiation. Both of these definitions are arranged
to be extensions of the corresponding operations on Schwartz functions. The
definitions are $\langle \psi T, \varphi \rangle = \langle T, \psi\varphi \rangle$ and $\langle D^\alpha T, \varphi \rangle = (-1)^{|\alpha|}\langle T, D^\alpha\varphi \rangle$; in the
latter case the factor $(-1)^{|\alpha|}$ is included because integration by parts requires its
presence when $T$ is given by a Schwartz function.

A useful feature of distributions in connection with differential equations, as we
shall see in more detail in later chapters, is that we can first look for solutions of a
given differential equation that are distributions and then consider how close those
distributions are to being functions. The special feature of tempered distributions
is that the Fourier transform makes sense on them, as follows.

As with the operations of multiplication by a Schwartz function and
differentiation, the definition of Fourier transform of a tempered distribution is arranged
to be an extension of the definition of the Fourier transform of a member $\psi$ of
$\mathcal{S}$ when we identify the function $\psi$ with the distribution $\psi(x)\,dx$. If $\varphi$ is in $\mathcal{S}$,
then $\int \widehat{\psi}\varphi\,dx = \int \psi\widehat{\varphi}\,dx$ by the multiplication formula,⁴ which we reinterpret
as $\langle \mathcal{F}(\psi\,dx), \varphi \rangle = \langle \psi\,dx, \widehat{\varphi} \rangle$. The definition is

$$\langle \mathcal{F}(T), \varphi \rangle = \langle T, \widehat{\varphi} \rangle$$

for $T \in \mathcal{S}'$ and $\varphi \in \mathcal{S}$. To see that $\mathcal{F}(T)$ is in $\mathcal{S}'$, we have to check that
$\mathcal{F}(T)$ is continuous. The definition is $\mathcal{F}(T) = T \circ \mathcal{F}$, and $\mathcal{F}$ is continuous on $\mathcal{S}$
by Corollary 3.3b. Thus the Fourier transform carries tempered distributions to
tempered distributions.
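
Two short worked instances may help fix the definitions of differentiation and of the Fourier transform on $\mathcal{S}'$. The symbols $H$ (the indicator function of $[0, \infty)$ on $\mathbb{R}^1$) and $\delta_0$ (the point mass at the origin) are notation introduced here for illustration; both define tempered distributions by Examples 2 and 5.

```latex
% (A) With N = 1, the distribution derivative of H is the point mass at 0.
%     The boundary term at infinity vanishes because \varphi is a Schwartz function.
\langle D H, \varphi \rangle
  = (-1)^{1} \langle H, D\varphi \rangle
  = -\int_0^\infty \varphi'(x)\,dx
  = \varphi(0)
  = \langle \delta_0, \varphi \rangle .

% (B) The Fourier transform of \delta_0 is the constant function 1,
%     regarded as the distribution 1\,dx.
\langle \mathcal{F}(\delta_0), \varphi \rangle
  = \langle \delta_0, \widehat{\varphi} \rangle
  = \widehat{\varphi}(0)
  = \int_{\mathbb{R}^N} \varphi(x)\,dx
  = \langle 1\,dx, \varphi \rangle .
```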


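Proposition 3.4. The Fourier transform $\mathcal{F}$ is one-one from $\mathcal{S}'(\mathbb{R}^N)$ onto
$\mathcal{S}'(\mathbb{R}^N)$.

PROOF. If $T$ is in $\mathcal{S}'$ and $\mathcal{F}(T) = 0$, then $\langle T, \mathcal{F}(\varphi) \rangle = 0$ for all $\varphi$ in $\mathcal{S}$. Since
$\mathcal{F}$ carries $\mathcal{S}$ onto $\mathcal{S}$, $\langle T, \psi \rangle = 0$ for all $\psi$ in $\mathcal{S}$, and thus $T = 0$. Therefore $\mathcal{F}$ is
one-one on $\mathcal{S}'$.

If $T'$ is given in $\mathcal{S}'$, put $T = T' \circ \mathcal{F}^{-1}$, where $\mathcal{F}^{-1}$ is the inverse Fourier
transform as a map of $\mathcal{S}$ to itself. Then $T' = T \circ \mathcal{F}$ and $\mathcal{F}(T) = T \circ \mathcal{F} = T'$.
Therefore $\mathcal{F}$ is onto $\mathcal{S}'$.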


2. Weak Derivatives and Sobolev Spaces 


A careful study of a linear partial differential equation often requires attention 
to the domain of the operator, and it is helpful to be able to work with partial 
derivatives without investigating a problem of interchange of limits at each step. 
Sobolev spaces are one kind of space of functions that are used for this purpose, 
and their definition involves “weak derivatives.” At the end one wants to be
able to deduce results about ordinary partial derivatives from results about weak 
derivatives, and Sobolev’s Theorem does exactly that. 

We shall make extensive use in this book of techniques for regularizing func- 
tions that have been developed in Basic. Let us assemble a number of these in 
one place for convenient reference. 


⁴Proposition 8.1e of Basic.

Proposition 3.5.

(a) (Theorems 6.20 and 9.13) Let $\varphi$ be in $L^1(\mathbb{R}^N, dx)$, define $\varphi_\varepsilon(x) = \varepsilon^{-N}\varphi(\varepsilon^{-1}x)$ for $\varepsilon > 0$, and put $c = \int_{\mathbb{R}^N} \varphi(x)\,dx$.

(i) If $f$ is in $L^p(\mathbb{R}^N, dx)$ with $1 \le p < \infty$, then
\[ \lim_{\varepsilon \downarrow 0} \|\varphi_\varepsilon * f - cf\|_p = 0. \]

(ii) If $f$ is bounded on $\mathbb{R}^N$ and is continuous at $x$, then $\lim_{\varepsilon \downarrow 0} (\varphi_\varepsilon * f)(x) = cf(x)$, and the convergence is uniform for any set $E$ of $x$'s such that $f$ is uniformly continuous at the points of $E$.

(b) (Proposition 9.9) If $\mu$ is a Borel measure on a nonempty open set $U$ in $\mathbb{R}^N$ and if $1 \le p < \infty$, then $L^p(U, \mu)$ is separable, and $C_{\mathrm{com}}(U)$ is dense in $L^p(U, \mu)$.

(c) (Corollary 6.19) Suppose that $\varphi$ is a compactly supported function of class $C^n$ on $\mathbb{R}^N$ and that $f$ is in $L^p(\mathbb{R}^N, dx)$ with $1 \le p \le \infty$. Then $\varphi * f$ is of class $C^n$, and $D^\alpha(\varphi * f) = (D^\alpha\varphi) * f$ for any iterated partial derivative $D^\alpha$ of order $\le n$.

(d) (Lemma 8.11) If $\delta_1$ and $\delta_2$ are given positive numbers with $\delta_1 < \delta_2$, then there exists $\psi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ with values in $[0, 1]$ such that $\psi(x) = \psi_0(|x|)$, $\psi_0$ is nonincreasing, $\psi(x) = 1$ for $|x| \le \delta_1$, and $\psi(x) = 0$ for $|x| \ge \delta_2$.

(e) (Consequence of (d)) If $\delta > 0$, then there exists $\varphi \ge 0$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ such that $\varphi(x) = \varphi_0(|x|)$ with $\varphi_0$ nonincreasing, $\varphi(x) = 0$ for $|x| \ge \delta$, and $\int_{\mathbb{R}^N} \varphi(x)\,dx = 1$.

(f) (Proposition 8.12) If $K$ and $U$ are subsets of $\mathbb{R}^N$ with $K$ compact, $U$ open, and $K \subseteq U$, then there exists $\varphi \in C^\infty_{\mathrm{com}}(U)$ with values in $[0, 1]$ such that $\varphi$ is identically 1 on $K$.


In this section we work with a nonempty open subset $U$ of $\mathbb{R}^N$, an index $p$ satisfying $1 \le p < \infty$, and the spaces $L^p(U) = L^p(U, dx)$, the underlying measure being understood to be Lebesgue measure. Let $p' = p/(p - 1)$ be the dual index. For Sobolev's Theorem, we shall impose two additional conditions on $U$, namely boundedness for $U$ and a certain regularity condition for the boundary $\partial U = U^{\mathrm{cl}} - U$ of the open set $U$, but we do not impose those additional conditions yet.


Corollary 3.6. If $U$ is a nonempty open subset of $\mathbb{R}^N$, then $C^\infty_{\mathrm{com}}(U)$ is

(a) uniformly dense in $C_{\mathrm{com}}(U)$,
(b) dense in $L^p(U)$ for every $p$ with $1 \le p < \infty$.

PROOF. Let $f$ in $C_{\mathrm{com}}(U)$ be given. Choose by Proposition 3.5e a function $\varphi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ that is $\ge 0$, vanishes outside the unit ball about the origin, and has total integral 1. For $\varepsilon > 0$, define $\varphi_\varepsilon(x) = \varepsilon^{-N}\varphi(\varepsilon^{-1}x)$. The function $\varphi_\varepsilon * f$ is of class $C^\infty$ by (c). If $U = \mathbb{R}^N$, let $\varepsilon_0 = 1$; otherwise let $\varepsilon_0$ be the distance from the support of $f$ to the complement of $U$. For $\varepsilon < \varepsilon_0$, $\varphi_\varepsilon * f$ has compact support contained in $U$. As $\varepsilon$ decreases to 0, Proposition 3.5a shows that $\|\varphi_\varepsilon * f - f\|_{\sup}$ tends to 0 and so does $\|\varphi_\varepsilon * f - f\|_p$. This proves the first conclusion of the corollary and proves also that $C^\infty_{\mathrm{com}}(U)$ is $L^p$ dense in $C_{\mathrm{com}}(U)$ if $1 \le p < \infty$. Since Proposition 3.5b shows that $C_{\mathrm{com}}(U)$ is dense in $L^p(U)$, the second conclusion of the corollary follows.
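The smoothing step in this proof can be sketched numerically in one dimension. The following is a minimal illustration, not part of the text: it assumes the particular bump $\varphi(s) = e^{-1/(1-s^2)}$ on $|s| < 1$ (normalized by the quadrature itself), a tent function as the continuous compactly supported $f$, and a midpoint rule for the convolution. The names `bump`, `mollify`, `tent`, and `sup_error` are hypothetical helpers introduced only for this sketch.

```python
import math

def bump(s):
    """Smooth bump supported in |s| < 1 (one possible choice of phi, unnormalized)."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

# Midpoint nodes on [-1, 1] in the scaled variable s = y / eps, with weights
# normalized so that the discrete version of phi has total integral 1.
N_NODES = 200
STEP = 2.0 / N_NODES
NODES = [-1.0 + (i + 0.5) * STEP for i in range(N_NODES)]
RAW = [bump(s) for s in NODES]
TOTAL = sum(RAW)
WEIGHTS = [w / TOTAL for w in RAW]

def mollify(f, eps, x):
    """Approximate (phi_eps * f)(x) = integral of phi_eps(y) f(x - y) dy."""
    return sum(w * f(x - eps * s) for w, s in zip(WEIGHTS, NODES))

def tent(x):
    """Continuous function with compact support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def sup_error(f, eps):
    """Largest value of |phi_eps * f - f| over a grid on [-2, 2]."""
    grid = [i / 100.0 for i in range(-200, 201)]
    return max(abs(mollify(f, eps, x) - f(x)) for x in grid)

if __name__ == "__main__":
    for eps in (0.5, 0.25, 0.125):
        print(eps, sup_error(tent, eps))
```

For the tent function the error is largest at the peak and scales linearly with $\varepsilon$, which is consistent with the uniform convergence asserted by Proposition 3.5a(ii).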


Suppose that $f$ and $g$ are two complex-valued functions that are locally integrable on $U$ in the sense of being integrable on each compact subset of $U$. If $\alpha$ is a differentiation index, we say that $D^\alpha f = g$ in the sense of weak derivatives if

\[ \int_U f(x)\,D^\alpha\varphi(x)\,dx = (-1)^{|\alpha|} \int_U g(x)\,\varphi(x)\,dx \qquad \text{for all } \varphi \in C^\infty_{\mathrm{com}}(U). \]


The definition is arranged so that $g$ gives the result that one would expect for iterated partial differentiation of type $\alpha$ if the integrated or boundary term gives 0 at each stage. More precisely if $f$ is in $C^{|\alpha|}(U)$, then the weak derivative of order $\alpha$ exists and is the pointwise derivative. To prove this, it is enough to handle a first-order partial derivative $D_j h$ for a function $h$ in $C^1(U)$, showing that $\int_U h\,D_j\varphi\,dx = -\int_U (D_j h)\varphi\,dx$ for $\varphi \in C^\infty_{\mathrm{com}}(U)$, i.e., that $\int_U D_j(h\varphi)\,dx = 0$. Because $\varphi$ is compactly supported in $U$, $\psi = h\varphi$ makes sense as a compactly supported $C^1$ function on $\mathbb{R}^N$, and we are to prove that $\int_{\mathbb{R}^N} D_j\psi\,dx = 0$. The Fundamental Theorem of Calculus gives $\int_{-a}^{a} D_j\psi\,dx_j = [\psi]_{x_j=-a}^{x_j=a}$ for $a > 0$, and the compact support implies that this is 0 for $a$ sufficiently large. Thus $\int_{\mathbb{R}} D_j\psi\,dx_j = 0$, and Fubini's Theorem gives $\int_{\mathbb{R}^N} D_j\psi\,dx = 0$.

The function $g$ in the definition of weak derivative is unique up to sets of measure 0 if it exists. In fact, if $g_1$ and $g_2$ are both weak derivatives of order $\alpha$, then $\int_U (g_1 - g_2)\varphi\,dx = 0$ for all $\varphi$ in $C^\infty_{\mathrm{com}}(U)$. Fix an open set $V$ having compact closure contained in $U$. If $f$ is in $C_{\mathrm{com}}(V)$, then Corollary 3.6a produces a sequence of functions $\varphi_n$ in $C^\infty_{\mathrm{com}}(V)$ tending uniformly to $f$. Since $g_1 - g_2$ is integrable on $V$, the equalities $\int_V (g_1 - g_2)\varphi_n\,dx = 0$ for all $n$ imply $\int_V (g_1 - g_2) f\,dx = 0$. By the uniqueness in the Riesz Representation Theorem, $g_1 = g_2$ a.e. on $V$. Since $V$ is arbitrary, $g_1 = g_2$ a.e. on $U$.
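As a sanity check on the definition, one can test numerically a standard case that is not treated in the text: on $U = (-1, 1)$ the function $|x|$ has weak derivative $\operatorname{sgn} x$. The sketch below assumes one particular test function, a bump centered at $0.2$ with support in $(-0.3, 0.7)$, and a midpoint rule; `phi`, `phi_prime`, and `integrate` are hypothetical helper names. It compares $\int_U |x|\,\varphi'(x)\,dx$ with $-\int_U (\operatorname{sgn} x)\,\varphi(x)\,dx$.

```python
import math

def bump(s):
    """Smooth bump supported in |s| < 1."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

def bump_prime(s):
    """Derivative of bump: bump(s) * (-2 s) / (1 - s^2)^2 inside the support."""
    if abs(s) >= 1.0:
        return 0.0
    return bump(s) * (-2.0 * s) / (1.0 - s * s) ** 2

CENTER, WIDTH = 0.2, 0.5  # support of phi is (-0.3, 0.7), inside U = (-1, 1)

def phi(x):
    return bump((x - CENTER) / WIDTH)

def phi_prime(x):
    return bump_prime((x - CENTER) / WIDTH) / WIDTH

def integrate(g, a, b, n=40000):
    """Midpoint rule; with n even on (-1, 1), the kink at 0 is a cell boundary."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def sgn(x):
    return (x > 0) - (x < 0)

# Left side: integral of f * D(phi) with f(x) = |x|.
lhs = integrate(lambda x: abs(x) * phi_prime(x), -1.0, 1.0)
# Right side: (-1)^1 times the integral of g * phi with g = sgn.
rhs = -integrate(lambda x: sgn(x) * phi(x), -1.0, 1.0)

if __name__ == "__main__":
    print(lhs, rhs)
```

Agreement for a single test function proves nothing by itself; the check only illustrates the sign convention $(-1)^{|\alpha|}$ in the definition.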


EXAMPLE. In the open set $U = (-1, 1) \subseteq \mathbb{R}^1$, the function $e^{i/|x|}$ is locally integrable and is differentiable except at $x = 0$, but it does not have a weak derivative. In fact, if it had $g$ as a weak derivative, we could use $\varphi$'s vanishing in neighborhoods of the origin to see that $g(x)$ has to be $-ix^{-2}(\operatorname{sgn} x)e^{i/|x|}$ almost everywhere. But this function is not locally integrable on $U$.

If $f$ has $\alpha$th weak derivative $D^\alpha f$ and $D^\alpha f$ has $\beta$th weak derivative $D^\beta(D^\alpha f)$, then $f$ has $(\beta + \alpha)$th weak derivative $D^{\beta+\alpha}f$ and $D^{\beta+\alpha}f = D^\beta(D^\alpha f)$. In fact, if $\varphi$ is in $C^\infty_{\mathrm{com}}(U)$, then this conclusion follows from the computation

\[ \int_U f\,D^{\beta+\alpha}\varphi\,dx = \int_U f\,D^\alpha(D^\beta\varphi)\,dx = (-1)^{|\alpha|}\int_U D^\alpha f\,D^\beta\varphi\,dx = (-1)^{|\alpha|+|\beta|}\int_U D^\beta(D^\alpha f)\,\varphi\,dx. \]


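If $f$ has weak $j$th partial derivative $D_j f$ and if $\psi$ is in $C^\infty(U)$, then $f\psi$ has a weak $j$th partial derivative, and it is given by $(D_j f)\psi + f(D_j\psi)$. In fact, this conclusion holds because

\[ \int_U f\psi\,(D_j\varphi)\,dx = \int_U f\,D_j(\psi\varphi)\,dx - \int_U f\,(D_j\psi)\varphi\,dx = -\int_U (D_j f)\psi\varphi\,dx - \int_U f\,(D_j\psi)\varphi\,dx = -\int_U \bigl(f(D_j\psi) + (D_j f)\psi\bigr)\varphi\,dx. \]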

If $f$ has $\beta$th weak derivative $D^\beta f$ for every $\beta$ with $\beta \le \alpha$ and if $\psi$ is in $C^\infty(U)$, then $f\psi$ has an $\alpha$th weak derivative. It is given by the Leibniz rule:

\[ D^\alpha(f\psi) = \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!\,(\alpha-\beta)!}\,(D^\beta f)(D^{\alpha-\beta}\psi). \]

This formula follows by iterating the formula for $D_j(f\psi)$ in the previous paragraph.

Now we can give the definition of Sobolev spaces. Let $k \ge 0$ be an integer, and let $1 \le p < \infty$. Define

\[ L^p_k(U) = \{\, f \in L^p(U) \mid \text{all } D^\alpha f \text{ exist weakly for } |\alpha| \le k \text{ and are in } L^p(U) \,\}. \]

Then $L^p_k(U)$ is a vector space, and we make it into a normed linear space by defining

\[ \|f\|_{L^p_k} = \Bigl( \sum_{|\alpha| \le k} \int_U |D^\alpha f|^p\,dx \Bigr)^{1/p}. \]

The normed linear spaces $L^p_k(U)$ are the Sobolev spaces for $U$. All the remaining results in this section concern these spaces.⁵
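The norm just defined can be evaluated directly in simple cases. The sketch below is only an illustration under stated assumptions: $U = (0, 1)$, $k = 1$, $p = 2$, $f(x) = \sin \pi x$ with its classical derivative (which is also its weak derivative since $f$ is $C^1$), and a midpoint rule for the integrals. The helper names `integrate` and `sobolev_norm` are hypothetical.

```python
import math

def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over (a, b)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def sobolev_norm(derivatives, a, b, p):
    """Norm in L^p_k((a, b)) given the list [f, f', ..., f^(k)] as callables.

    Implements (sum over orders j <= k of the integral of |f^(j)|^p)^(1/p),
    which is the one-dimensional case of the norm defined above.
    """
    total = 0.0
    for d in derivatives:
        total += integrate(lambda x: abs(d(x)) ** p, a, b)
    return total ** (1.0 / p)

def f(x):
    return math.sin(math.pi * x)

def df(x):
    return math.pi * math.cos(math.pi * x)

if __name__ == "__main__":
    # Exact value for comparison: sqrt(1/2 + pi^2/2).
    print(sobolev_norm([f, df], 0.0, 1.0, 2), math.sqrt((1 + math.pi ** 2) / 2))
```

Here $\int_0^1 \sin^2 \pi x\,dx = \tfrac12$ and $\int_0^1 \pi^2\cos^2 \pi x\,dx = \tfrac{\pi^2}{2}$, so the exact norm is $\sqrt{(1+\pi^2)/2}$.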


Proposition 3.7. If $k \ge 0$ is an integer and if $1 \le p < \infty$, then the normed linear space $L^p_k(U)$ is complete.

⁵The subject of partial differential equations makes use of a number of families that generalize these spaces in various ways. Of particular importance is a family $H^s$ such that $H^s = L^2_k$ when $s$ is an integer $k \ge 0$ but $s$ is a continuous real parameter with $-\infty < s < \infty$. The spaces $H^s(\mathbb{R}^N)$ are introduced in Problems 8-12 at the end of the chapter. For an open set $U$, the two spaces $H^s_{\mathrm{com}}(U)$ and $H^s_{\mathrm{loc}}(U)$ are introduced in Chapter VIII. All of these spaces are called Sobolev spaces.

PROOF. If $\{f_m\}$ is a Cauchy sequence in $L^p_k(U)$, then for each $\alpha$ with $|\alpha| \le k$, the sequence $\{D^\alpha f_m\}$ is Cauchy in $L^p(U)$. Since $L^p(U)$ is complete, we can define $f^{(\alpha)}$ to be the $L^p(U)$ limit of $D^\alpha f_m$. For $\varphi$ in $C^\infty_{\mathrm{com}}(U)$, we then have

\[ \int_U f^{(\alpha)}\varphi\,dx = \int_U \lim_m (D^\alpha f_m)\varphi\,dx = \lim_m \int_U (D^\alpha f_m)\varphi\,dx, \]

the second equality holding since $\varphi$ is in the dual space $L^{p'}(U)$. In turn, this expression is equal to

\[ (-1)^{|\alpha|} \lim_m \int_U (f_m)(D^\alpha\varphi)\,dx = (-1)^{|\alpha|} \int_U (f^{(0)})(D^\alpha\varphi)\,dx, \]

the second equality holding since $D^\alpha\varphi$ is in $L^{p'}(U)$. Therefore $f^{(\alpha)} = D^\alpha f^{(0)}$ and $f_m$ tends to $f^{(0)}$ in $L^p_k(U)$.


Proposition 3.8. If $k \ge 0$ is an integer and if $1 \le p < \infty$, then a function $f$ is in $L^p_k(U)$ if $f$ is in $L^p(U)$ and there exists a sequence $\{f_m\}$ in $C^k(U)$ such that

(a) $\lim_m \|f - f_m\|_p = 0$,
(b) for each $\alpha$ with $|\alpha| \le k$, the iterated pointwise partial derivative $D^\alpha f_m$ is in $L^p(U)$ and converges in $L^p(U)$ as $m$ tends to infinity.

PROOF. By (b), $\|D^\alpha(f_l - f_m)\|_p^p$ for each fixed $\alpha$ tends to 0 as $l$ and $m$ tend to infinity. Summing on $\alpha$ and taking the $p$th root, we see that $\|f_l - f_m\|_{L^p_k}$ tends to 0. In other words, $\{f_m\}$ is Cauchy in $L^p_k(U)$. By Proposition 3.7, $\{f_m\}$ converges to some $g$ in $L^p_k(U)$. The limit function $g$ has to have the property that $\|f_m - g\|_p$ tends to 0, and (a) shows that we must have $g = f$. Therefore $f$ is in $L^p_k(U)$.


The key theorem is the following converse to Proposition 3.8. 


Theorem 3.9. If $k \ge 0$ is an integer and if $1 \le p < \infty$, then $C^\infty(U) \cap L^p_k(U)$ is dense in $L^p_k(U)$.

On the other hand, despite Corollary 3.6b, it will be a consequence of Sobolev's Theorem that $C^\infty_{\mathrm{com}}(U)$ is not dense in $L^p_k(U)$ if $k$ is sufficiently large. The proof of the present theorem will be preceded by a lemma affirming that at least the members of $L^p_k(U)$ with compact support in $U$ can be approximated by members of $C^\infty_{\mathrm{com}}(U)$.

In addition, the proof of the theorem will make use of an “exhausting sequence” and a smooth partition of unity based on it. Since $U$ is locally compact and $\sigma$-compact, we can find a sequence $\{K_n\}_{n=1}^\infty$ of compact subsets of $U$ with union $U$ such that $K_n \subseteq K_{n+1}^o$ for all $n$. This sequence is called an exhausting sequence for $U$. We construct the partition of unity $\{\psi_n\}_{n \ge 1}$ as follows. For $n = 1$, we use Proposition 3.5f to choose a $C^\infty$ function $\varphi_1$ with values in $[0, 1]$ such that

\[ \varphi_1(x) = \begin{cases} 1 & \text{for } x \in K_3, \\ 0 & \text{for } x \in (K_4^o)^c, \end{cases} \]

and for $n \ge 2$,

\[ \varphi_n(x) = \begin{cases} 1 & \text{for } x \in K_{n+2} - K_{n+1}^o, \\ 0 & \text{for } x \in (K_{n+3}^o)^c \cup K_n. \end{cases} \]

In the sum $\sum_{n=1}^\infty \varphi_n(x)$, each $x$ has a neighborhood in which only finitely many terms are nonzero and some term is nonzero. Therefore $\varphi = \sum_{n=1}^\infty \varphi_n$ is a well-defined member of $C^\infty(U)$. If we put $\psi_n = \varphi_n/\varphi$, then $\psi_n$ is in $C^\infty(U)$, $\sum_{n=1}^\infty \psi_n = 1$ on $U$, $\psi_1(x)$ is $> 0$ on $K_3$ and is $= 0$ on $(K_4^o)^c$, and for $n \ge 2$,

\[ \psi_n(x) \begin{cases} > 0 & \text{for } x \in K_{n+2} - K_{n+1}^o, \\ = 0 & \text{for } x \in (K_{n+3}^o)^c \cup K_n. \end{cases} \]


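Lemma 3.10. Let $\varphi$ be a member of $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ vanishing for $|x| \ge 1$ and having total integral 1, put $\varphi_\varepsilon(x) = \varepsilon^{-N}\varphi(\varepsilon^{-1}x)$ for $\varepsilon > 0$, and let $f$ be a function in $L^p_k(U)$ whose support is a compact subset of $U$. For $\varepsilon$ sufficiently small, $\varphi_\varepsilon * f$ is in $C^\infty_{\mathrm{com}}(U)$, and

\[ \lim_{\varepsilon \downarrow 0} \|\varphi_\varepsilon * f - f\|_{L^p_k} = 0. \]

PROOF. As in the proof of Corollary 3.6, $\varphi_\varepsilon * f$ has compact support contained in $U$ if $\varepsilon < \varepsilon_0$, where $\varepsilon_0$ is 1 if $U = \mathbb{R}^N$ and $\varepsilon_0$ is the distance of the support of $f$ to the complement of $U$ if $U \ne \mathbb{R}^N$. Moreover, the function $\varphi_\varepsilon * f$ is in $C^\infty(\mathbb{R}^N)$ with $D^\alpha(\varphi_\varepsilon * f) = (D^\alpha\varphi_\varepsilon) * f$ for each $\alpha$. Thus $\varphi_\varepsilon * f$ is in $C^\infty_{\mathrm{com}}(U)$ if $\varepsilon < \varepsilon_0$. By the first remark after the definition of weak derivative, $\varphi_\varepsilon * f$ has weak derivatives of all orders for $\varepsilon < \varepsilon_0$, and they are given by the ordinary derivatives $D^\alpha(\varphi_\varepsilon * f)$. For $\varepsilon < \varepsilon_0$,

\[ D^\alpha(\varphi_\varepsilon * f)(x) = \int_U f(y)\,(D^\alpha\varphi_\varepsilon)(x - y)\,dy = (-1)^{|\alpha|} \int_U f(y)\,D^\alpha\bigl(y \mapsto \varphi_\varepsilon(x - y)\bigr)\,dy. \]

Since $f$ by assumption has weak derivatives through order $k$ and since $y \mapsto \varphi_\varepsilon(x - y)$ has compact support in $U$, the right side is equal to

\[ \int_U D^\alpha f(y)\,\varphi_\varepsilon(x - y)\,dy = (\varphi_\varepsilon * D^\alpha f)(x) \]

for $|\alpha| \le k$. Therefore, for $\varepsilon < \varepsilon_0$ and $|\alpha| \le k$, we have

\[ \|D^\alpha(\varphi_\varepsilon * f - f)\|_p = \|\varphi_\varepsilon * (D^\alpha f) - D^\alpha f\|_p. \]

For these same $\alpha$'s, Proposition 3.5a shows that the right side tends to 0 as $\varepsilon$ tends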
to 0. Therefore $\varphi_\varepsilon * f - f$ tends to 0 in $L^p_k(U)$.

PROOF OF THEOREM 3.9. Let $f$ be in $L^p_k(U)$. The idea is to break $f$ into a countable sum of functions of compact support, apply the lemma to each piece, and add the results. The difficulty lies in arranging that each of the pieces of $f$ have controlled weak derivatives through order $k$. Thus instead of using indicator functions to break up $f$, we shall use an exhausting sequence $\{K_n\}_{n \ge 1}$ and an associated partition of unity $\{\psi_n\}_{n \ge 1}$ of the kind described after the statement of the theorem. The discussion above concerning the Leibniz rule shows that each $\psi_n f$ has weak derivatives of all orders $\le k$, and the construction shows that $\psi_n f$ has support in $K_5$ for $n = 1$ and in $K_{n+4}^o - K_{n-1}$ for $n \ge 2$.

Let $\varepsilon > 0$ be given, let $\varphi$ be a member of $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ vanishing for $|x| \ge 1$ and having total integral 1, and put $\varphi_\varepsilon(x) = \varepsilon^{-N}\varphi(\varepsilon^{-1}x)$ for $\varepsilon > 0$. Applying Lemma 3.10 to $\psi_n f$, choose $\varepsilon_n > 0$ small enough so that the function $u_n = \varphi_{\varepsilon_n} * (\psi_n f)$ has support in $K_5$ for $n = 1$ and in $K_{n+4}^o - K_{n-1}$ for $n \ge 2$ and so that

\[ \|u_n - \psi_n f\|_{L^p_k} < 2^{-n}\varepsilon. \]

Put $u = \sum_{n=1}^\infty u_n$. Each $x$ in $U$ has a neighborhood on which only finitely many of the functions $u_n$ are not identically 0, and therefore $u$ is in $C^\infty(U)$. Also,

\[ u = \sum_{n=1}^\infty (u_n - \psi_n f) + f \qquad \text{since } \sum_{n=1}^\infty \psi_n = 1. \]

Since for each compact subset of $U$, only finitely many $u_n - \psi_n f$ are not identically 0 on that set, the weak derivatives of order $\le k$ satisfy $D^\alpha u = \sum_{n=1}^\infty D^\alpha(u_n - \psi_n f) + D^\alpha f$. Hence

\[ D^\alpha(u - f) = \sum_{n=1}^\infty D^\alpha(u_n - \psi_n f). \]

Minkowski's inequality for integrals therefore gives

\[ \|D^\alpha(u - f)\|_p \le \sum_{n=1}^\infty \|D^\alpha(u_n - \psi_n f)\|_p \le \sum_{n=1}^\infty \|u_n - \psi_n f\|_{L^p_k} \le \sum_{n=1}^\infty 2^{-n}\varepsilon = \varepsilon. \]

Finally we raise both sides to the $p$th power, sum for $\alpha$ with $|\alpha| \le k$, and extract the $p$th root. If $m(k)$ denotes the number of such $\alpha$'s, we obtain

\[ \|u - f\|_{L^p_k} \le m(k)^{1/p}\,\varepsilon, \]


and the proof is complete.

Now we come to Sobolev's Theorem. For the remainder of the section, the open set $U$ will be assumed bounded, and we shall impose a regularity condition on its boundary $\partial U = U^{\mathrm{cl}} - U$. When we isolate one of the coordinates of points in $\mathbb{R}^N$, say the $j$th, let us write $y'$ for the other $N - 1$ coordinates, so that $y = (y_j, y')$. We say that $U$ satisfies the cone condition if there exist positive constants $c$ and $h$ such that for each $x$ in $U$, there are a sign $\pm$ and an index $j$ with $1 \le j \le N$ for which the closed truncated cone

\[ \Gamma_x = x + \{\, y = (y_j, y') \mid \pm y_j \ge c|y'| \text{ and } |y| \le h \,\} \]

lies in $U$ for one choice of the sign $\pm$. See Figure 3.1. Problem 4 at the end of the chapter observes that if the bounded open set $U$ has a $C^1$ boundary in a certain sense, then $U$ satisfies the cone condition.


FIGURE 3.1. Cone condition for a bounded open set.


Theorem 3.11 (Sobolev's Theorem). Let $U$ be a nonempty bounded open set in $\mathbb{R}^N$, and suppose that $U$ satisfies the cone condition with constants $c$ and $h$. If $1 \le p < \infty$ and $k > N/p$, then there exists a constant $C = C(N, c, h, p, k)$ such that

\[ \sup_{x \in U} |u(x)| \le C\,\|u\|_{L^p_k} \]

for all $u$ in $C^\infty(U) \cap L^p_k(U)$.

REMARK. Under the stated conditions on $k$ and $p$, the theorem says that the inclusion of $C^\infty(U) \cap L^p_k(U)$ into the Banach space $C(U)$ of bounded continuous functions on $U$ is a bounded linear operator relative to the norm of $L^p_k(U)$. Since $C^\infty(U) \cap L^p_k(U)$ is dense in $L^p_k(U)$ by Theorem 3.9 and since $C(U)$ is complete, the inclusion extends to a continuous map of $L^p_k(U)$ into $C(U)$. In other words, every member of $L^p_k(U)$ can be regarded as a bounded continuous function on $U$.


PROOF. Fix $g$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^1)$ with $g(t)$ equal to 1 for $|t| \le \tfrac12$ and equal to 0 for $|t| \ge \tfrac34$. Fix $x$ in $U$ and its associated sign $\pm$ and index $j$. We introduce spherical coordinates about $x$ with the indices reordered so that $j$ comes first, writing $x + y$ for a point near $x$ with

\[
\begin{aligned}
y_j &= \pm r\cos\varphi, \\
y_1 &= r\sin\varphi\cos\theta_1, \\
&\;\;\vdots \qquad \text{(with $y_j$ omitted)} \\
y_{N-1} &= r\sin\varphi\sin\theta_1\cdots\sin\theta_{N-3}\cos\theta_{N-2}, \\
y_N &= r\sin\varphi\sin\theta_1\cdots\sin\theta_{N-3}\sin\theta_{N-2},
\end{aligned}
\]

when

\[ 0 \le \varphi \le \pi, \qquad 0 \le \theta_i \le \pi \ \text{ for } i < N-2, \qquad 0 \le \theta_{N-2} \le 2\pi. \]

All the points $x + y$ with $0 \le \varphi \le \Phi(c)$, where $\Phi(c)$ is some positive number, and $0 \le r \le h$ lie in the cone $\Gamma_x$ at $x$. For such $\varphi$'s and for $0 \le t \le h$, we define

\[ F(t) = g(t/h)\,u\bigl(x + (\pm t\cos\varphi,\ t\sin\varphi\cos\theta_1,\ \dots)\bigr) \]

and expand $F$ in a Taylor series through order $k - 1$ with remainder about the point $t = h$. Because of the behavior of $g$, $F$ and all its derivatives vanish at $t = h$. Therefore $F(t)$ is given by the remainder term:

\[ F(t) = \frac{1}{(k-1)!} \int_h^t (t - s)^{k-1} F^{(k)}(s)\,ds. \]

Putting $t = 0$, we obtain

\[
\begin{aligned}
u(x) &= \frac{(-1)^{k-1}}{(k-1)!} \int_h^0 r^{k-1}\,\frac{\partial^k}{\partial r^k}\bigl[g(r/h)\,u(x + \cdots)\bigr]\,dr \\
&= \frac{(-1)^{k}}{(k-1)!} \int_0^h r^{k-N}\,\frac{\partial^k}{\partial r^k}\bigl[g(r/h)\,u(x + \cdots)\bigr]\,r^{N-1}\,dr.
\end{aligned}
\]

We regard the integral on the right side as taking place over the radial part of the spherical coordinates that describe the set of $y$'s in $\Gamma_x - x$, and we want to extend the integration over all of $\Gamma_x - x$. To do so, we have to integrate over all values of $\theta_1, \dots, \theta_{N-2}$ and for $0 \le \varphi \le \Phi(c)$. We multiply by the spherical part of the Jacobian determinant for spherical coordinates and integrate both sides. The integrand on the left side is constant, being independent of $y$, and gives a positive multiple of $u(x)$. Dividing by that multiple, we get
\[ u(x) = c \int_{\Gamma_x - x} |y|^{k-N}\,\frac{\partial^k}{\partial r^k}\bigl[g(r/h)\,u(x + y)\bigr]\,dy. \]

Suppose temporarily that $p > 1$. With $p'$ still denoting the index dual to $p$, application of Hölder's inequality gives

\[ |u(x)| \le |c| \Bigl( \int_{\Gamma_x - x} |y|^{(k-N)p'}\,dy \Bigr)^{1/p'} \Bigl( \int_{\Gamma_x - x} \Bigl| \frac{\partial^k}{\partial r^k}\bigl[g(r/h)\,u(x + y)\bigr] \Bigr|^p\,dy \Bigr)^{1/p}. \]

The first integral on the right side is the critical one. The radius extends from 0 to $h$, and the integral is finite if and only if $(k - N)p' > -N$, i.e., $k > N - N/p' = N/p$. This is the condition in the theorem.

The differentiation $\frac{\partial^k}{\partial r^k}$ in the second factor on the right can be expanded in terms of derivatives in Cartesian coordinates, and then the integration can be extended over all of $U$. The result is that the second factor is dominated by a multiple of $\|u\|_{L^p_k}$. This completes the proof when $p > 1$.

Now suppose that $p = 1$. Then the above result from applying Hölder's inequality is replaced by the inequality

\[ |u(x)| \le |c| \Bigl( \sup_{y \in \Gamma_x - x} |y|^{k-N} \Bigr) \int_{\Gamma_x - x} \Bigl| \frac{\partial^k}{\partial r^k}\bigl[g(r/h)\,u(x + y)\bigr] \Bigr|\,dy. \]

The first factor is finite if $k \ge N$, and the second factor is handled as before. This completes the proof if $p = 1$.
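The finiteness condition for the first integral in the Hölder estimate can be made explicit by passing to spherical coordinates. In the following worked computation, $A$ denotes the total angular measure of the cone (a positive constant depending only on $N$ and $c$); the symbol $A$ is introduced here for illustration and does not appear in the proof.

```latex
\int_{\Gamma_x - x} |y|^{(k-N)p'}\,dy
  = A \int_0^h r^{(k-N)p'}\, r^{N-1}\,dr
  = A\,\frac{h^{(k-N)p'+N}}{(k-N)p'+N}
  \qquad \text{when } (k-N)p' + N > 0,
\\[4pt]
(k-N)p' + N > 0
  \iff k - N > -\frac{N}{p'} = -N\Bigl(1 - \frac{1}{p}\Bigr)
  \iff k > \frac{N}{p}.
```

When $(k-N)p' + N \le 0$, the radial integrand behaves like $r^{-1}$ or worse near $r = 0$, and the integral diverges.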


Corollary 3.12. Suppose that $U$ is a nonempty bounded open subset of $\mathbb{R}^N$ satisfying the cone condition, and suppose that $1 \le p < \infty$ and that $m$ and $k$ are integers $\ge 0$ such that $k > m + N/p$. If $f$ is in $L^p_k(U)$, then $f$ can be redefined on a set of measure 0 so as to be in $C^m(U)$.

PROOF. Choose by Theorem 3.9 a sequence $\{f_i\}$ in $C^\infty(U) \cap L^p_k(U)$ such that $\lim f_i = f$ in $L^p_k(U)$. For $|\alpha| \le m$, we apply Theorem 3.11 to see that

\[ \sup_{x \in U} |D^\alpha f_i(x) - D^\alpha f_j(x)| \]

tends to 0 as $i$ and $j$ tend to infinity. Thus all the $D^\alpha f_i$ converge uniformly. It follows that the uniform-limit function $\tilde{f} = \lim f_i$ is in $C^m(U)$. Since $f_i \to f$ in $L^p(U)$ and $f_i \to \tilde{f}$ uniformly, we conclude that $\tilde{f} = f$ almost everywhere. Thus $\tilde{f}$ tells how to redefine $f$ on a set of measure 0 so as to be in $C^m(U)$.
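The hypothesis $k > m + N/p$ is easy to tabulate. The helper below is a small sketch, with the hypothetical name `min_sobolev_order`, that returns the smallest integer $k$ satisfying the strict inequality; it uses exact rational arithmetic so that borderline cases such as $N/p$ being an integer are handled correctly.

```python
import math
from fractions import Fraction

def min_sobolev_order(N, p, m=0):
    """Smallest integer k with k > m + N/p, as required in Corollary 3.12.

    N is the dimension, p the integrability index (an integer or Fraction
    with p >= 1), and m the desired number of continuous derivatives.
    """
    threshold = Fraction(m) + Fraction(N) / Fraction(p)
    # The inequality is strict, so an integer threshold is itself not enough.
    return math.floor(threshold) + 1

if __name__ == "__main__":
    for N, p, m in [(1, 2, 0), (2, 2, 0), (3, 2, 0), (3, 2, 1), (3, 1, 0)]:
        print(N, p, m, min_sobolev_order(N, p, m))
```

For instance, with $p = 2$ and $N = 3$, membership in $L^2_2(U)$ already forces continuity, while $C^1$ regularity requires $k = 3$.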


3. Harmonic Functions 


Let $U$ be an open set in $\mathbb{R}^N$. The discussion will not be very interesting for $N = 1$, and we exclude that case. A function $u$ in $C^2(U)$ is harmonic in $U$ if $\Delta u = 0$ identically in $U$. Harmonic functions were introduced already in Chapter I and investigated in connection with certain boundary-value problems. In the present section we examine properties of harmonic functions more generally. Harmonic functions in a half space, through their boundary values and the Poisson integral formula, become a tool in analysis for working with functions on the Euclidean boundary, and the behavior of harmonic functions on general open sets becomes a prototype for the behavior of solutions of further “elliptic” second-order partial differential equations.

Harmonic functions will be characterized shortly in terms of a certain mean-value property. To get at this characterization and its ramifications, we need the $N$-dimensional “Divergence Theorem” of Gauss for two special cases: a ball and a half space. The result for a ball will be formulated as in Lemma 3.13 below; we give a proof since this theorem was not treated in Basic. The argument for a half space is quite simple, and we will incorporate what we need into the proof of Proposition 3.15 below. For the case of a ball, recall⁶ that the change-of-variables formula $x = r\omega$, with $r \ge 0$ and $|\omega| = 1$, for transforming integrals in Cartesian coordinates for $\mathbb{R}^N$ into spherical coordinates involves substituting $dx = r^{N-1}\,dr\,d\omega$, where $d\omega$ is a certain rotation-invariant measure on the unit sphere $S^{N-1}$ that can be expressed in terms of $N - 1$ angular variables. The open ball of center $x_0$ and radius $r$ is denoted by $B(r; x_0)$, and its boundary is $\partial B(r; x_0)$.


Lemma 3.13. If $F$ is a $C^1$ function in an open set on $\mathbb{R}^N$ containing the closed ball $B(r; x_0)^{\mathrm{cl}}$ and if $1 \le j \le N$, then

\[ \int_{x \in B(r;0)} \frac{\partial F}{\partial x_j}(x_0 + x)\,dx = \int_{r\omega \in \partial B(r;0)} x_j\,F(x_0 + r\omega)\,r^{N-2}\,d\omega. \]

REMARKS. The usual formula of the Divergence Theorem is $\int_U \operatorname{div} F\,dx = \int_{\partial U} (F \cdot n)\,dS$, where $U$ is a suitable bounded open set, $\partial U = U^{\mathrm{cl}} - U$ is its boundary, $n$ is the outward-pointing unit normal, $F$ is a vector-valued $C^1$ function, and $dS$ is surface area. In Lemma 3.13, $U$ is specialized to the ball $B(r; 0)$, $dS$ is the $(N - 1)$-dimensional area measure $r^{N-1}\,d\omega$ on the surface $\partial B(r; 0)$ of the ball, $F$ is taken to be the product of $F$ by the $j$th standard basis vector $e_j$, and $e_j \cdot n$ is $r^{-1}x_j$.

PROOF. Without loss of generality, we may take $j = 1$ and $x_0 = 0$. Write $x = (x_1, x')$, where $x' = (x_2, \dots, x_N)$, and write $\omega = (\omega_1, \omega')$ similarly. The left side in the displayed formula is equal to

\[ \int_{|x'| \le r} \Bigl[ \int_{x_1 = -\sqrt{r^2 - |x'|^2}}^{\sqrt{r^2 - |x'|^2}} \frac{\partial F}{\partial x_1}(x_1, x')\,dx_1 \Bigr]\,dx' = \int_{|x'| \le r} \Bigl[ F\bigl(\sqrt{r^2 - |x'|^2},\, x'\bigr) - F\bigl(-\sqrt{r^2 - |x'|^2},\, x'\bigr) \Bigr]\,dx'. \]
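⁶From Section VI.5 of Basic.

Thus the lemma will follow if it is proved that

\[ \int_{|x'| \le r} F\bigl(\sqrt{r^2 - |x'|^2},\, x'\bigr)\,dx' = \int_{|\omega| = 1,\ \omega_1 \ge 0} x_1\,F(r\omega)\,r^{N-2}\,d\omega \tag{$*$} \]

and

\[ -\int_{|x'| \le r} F\bigl(-\sqrt{r^2 - |x'|^2},\, x'\bigr)\,dx' = \int_{|\omega| = 1,\ \omega_1 \le 0} x_1\,F(r\omega)\,r^{N-2}\,d\omega. \tag{$**$} \]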


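Let us use ordinary spherical coordinates for $\omega$, with

\[ \begin{pmatrix} r\omega_1 \\ r\omega_2 \\ \vdots \\ r\omega_{N-1} \\ r\omega_N \end{pmatrix} = \begin{pmatrix} r\cos\theta_1 \\ r\sin\theta_1\cos\theta_2 \\ \vdots \\ r\sin\theta_1\cdots\sin\theta_{N-2}\cos\theta_{N-1} \\ r\sin\theta_1\cdots\sin\theta_{N-2}\sin\theta_{N-1} \end{pmatrix} \]

and

\[ d\omega = \sin^{N-2}\theta_1\,\sin^{N-3}\theta_2 \cdots \sin\theta_{N-2}\,d\theta_1 \cdots d\theta_{N-1}. \]

The right side of $(*)$ is equal to

\[ \int_{|\omega| = 1,\ \omega_1 \ge 0} F(r\omega)\,\omega_1\,r^{N-1}\,d\omega = \int_{\substack{0 \le \theta_1 \le \pi/2, \\ 0 \le \theta_j \le \pi \text{ for } 1 < j < N-1, \\ 0 \le \theta_{N-1} \le 2\pi}} F(r\omega)\,r^{N-1}\cos\theta_1\,\sin^{N-2}\theta_1\,\sin^{N-3}\theta_2 \cdots \sin\theta_{N-2}\,d\theta_1 \cdots d\theta_{N-1}, \]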
and we show that it equals the left side of $(*)$ by carrying out for the left side of $(*)$ the change of variables $x' \leftrightarrow (\theta_1, \dots, \theta_{N-1})$ given with $r$ constant by

\[ \begin{pmatrix} x_2 \\ \vdots \\ x_{N-1} \\ x_N \end{pmatrix} = \begin{pmatrix} r\sin\theta_1\cos\theta_2 \\ \vdots \\ r\sin\theta_1\cdots\sin\theta_{N-2}\cos\theta_{N-1} \\ r\sin\theta_1\cdots\sin\theta_{N-2}\sin\theta_{N-1} \end{pmatrix}. \]

The Jacobian matrix is the same as for the change to spherical coordinates $(r, \theta_2, \dots, \theta_{N-1})$ except that the first column has a factor $r\cos\theta_1$ instead of 1 and the other columns have an extra factor of $\sin\theta_1$. Consequently

\[ dx' = r^{N-1}\,|\cos\theta_1|\,\sin^{N-2}\theta_1\,\bigl(\sin^{N-3}\theta_2 \cdots \sin\theta_{N-2}\bigr)\,d\theta_1 \cdots d\theta_{N-1}. \]

Therefore the measures match in the two transformed sides, the sets of integration for $(\theta_1, \dots, \theta_{N-1})$ are the same, and the integrands are the same because $\cos\theta_1 = |\cos\theta_1|$. This proves $(*)$. For $(**)$ we make the same computation but the interval of integration for $\theta_1$ is $\pi/2 \le \theta_1 \le \pi$. To get a match, the minus sign is necessary
because $\cos\theta_1 = -|\cos\theta_1|$.

Proposition 3.14 (Green's formula⁷ for a ball). Let $B$ be an open ball in $\mathbb{R}^N$, let $\partial B$ be its surface, and let $d\sigma$ be the surface-area measure of $\partial B$. If $u$ and $v$ are $C^2$ functions in an open set containing $B^{\mathrm{cl}}$, then

\[ \int_B (u\,\Delta v - v\,\Delta u)\,dx = \int_{\partial B} \Bigl( u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n} \Bigr)\,d\sigma, \]

where $n : \partial B \to \mathbb{R}^N$ is the outward-pointing unit normal vector.

PROOF. Apply Lemma 3.13 to $F = u\,\frac{\partial v}{\partial x_j}$ and then to $F = v\,\frac{\partial u}{\partial x_j}$ and subtract the results. Then sum on $j$.
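The angular measure $d\omega$ used in the proof of Lemma 3.13 can be checked against a known closed form. The sketch below integrates the density $\sin^{N-2}\theta_1 \sin^{N-3}\theta_2 \cdots \sin\theta_{N-2}$ over the stated ranges of the angles by a midpoint rule and compares the total mass with the standard formula $2\pi^{N/2}/\Gamma(N/2)$ for the area of the unit sphere in $\mathbb{R}^N$. That formula is quoted as a well-known fact and is not derived in the text; the function names are hypothetical.

```python
import math

def sin_power_integral(m, n=20000):
    """Midpoint-rule value of the integral of sin(theta)^m over [0, pi]."""
    h = math.pi / n
    return h * sum(math.sin((i + 0.5) * h) ** m for i in range(n))

def sphere_area(N):
    """Total mass of d(omega) on the unit sphere in R^N, for N >= 2.

    The last angle ranges over [0, 2*pi] and contributes 2*pi; the other
    N - 2 angles range over [0, pi] and contribute the integrals of
    sin^m for m = 1, ..., N - 2.
    """
    area = 2.0 * math.pi
    for m in range(1, N - 1):
        area *= sin_power_integral(m)
    return area

def sphere_area_closed_form(N):
    """Standard closed form 2 * pi^(N/2) / Gamma(N/2)."""
    return 2.0 * math.pi ** (N / 2.0) / math.gamma(N / 2.0)

if __name__ == "__main__":
    for N in range(2, 7):
        print(N, sphere_area(N), sphere_area_closed_form(N))
```

For $N = 3$ this reproduces the familiar value $4\pi$, obtained as $2\pi \int_0^\pi \sin\theta\,d\theta$.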


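Let $\Omega_{N-1}$ be the surface area $\int_{S^{N-1}} d\omega$ of the unit sphere in $\mathbb{R}^N$. A continuous function $u$ on an open subset $U$ of $\mathbb{R}^N$ is said to have the mean-value property in $U$ if the value of $u$ at each point $x$ in $U$ equals the average value of $u$ over each sphere centered at $x$ and lying in $U$, i.e., if

\[ u(x) = \frac{1}{\Omega_{N-1}} \int_{\omega \in S^{N-1}} u(x + t\omega)\,d\omega \]

for every $x$ in $U$ and for every positive $t$ less than the distance from $x$ to $U^c$. The mean-value property over spheres implies a corresponding average-value property over balls. In fact, the volume $|B(t_0; 0)|$ of the ball $B(t_0; 0)$ is given by $\int_0^{t_0} \int_{S^{N-1}} t^{N-1}\,d\omega\,dt = N^{-1}t_0^N \int_{S^{N-1}} d\omega = N^{-1}t_0^N\,\Omega_{N-1}$. When the mean-value property over spheres is satisfied and $t_0$ is less than the distance from $x$ to $U^c$, we can apply the operation $N t_0^{-N} \int_0^{t_0} (\,\cdot\,)\,t^{N-1}\,dt$ to both sides of the mean-value formula and obtain

\[ u(x) = \frac{N t_0^{-N}}{\Omega_{N-1}} \int_0^{t_0} \int_{\omega \in S^{N-1}} u(x + t\omega)\,t^{N-1}\,d\omega\,dt = \frac{1}{|B(t_0; 0)|} \int_{B(t_0; 0)} u(x + y)\,dy. \]

Proposition 3.15 (Green's formula for a half space). Let $\mathbb{R}^N_+$ be the subset of $\mathbb{R}^N = \{(x', x_N) \mid x' \in \mathbb{R}^{N-1} \text{ and } x_N \in \mathbb{R}\}$ where $x_N > 0$. Denote its boundary by $\partial\mathbb{R}^N_+ = \mathbb{R}^{N-1}$, and suppose that $u$ and $v$ are $C^2$ functions on an open subset of $\mathbb{R}^N$ containing $(\mathbb{R}^N_+)^{\mathrm{cl}}$ and that at least one of $u$ and $v$ is compactly supported. Then

\[ \int_{x \in \mathbb{R}^N_+} (u\,\Delta v - v\,\Delta u)\,dx = \int_{x' \in \mathbb{R}^{N-1}} \Bigl( v\,\frac{\partial u}{\partial x_N} - u\,\frac{\partial v}{\partial x_N} \Bigr)\,dx'. \]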

PROOF. Suppose $F$ is a $C^1$ function compactly supported on an open subset of $\mathbb{R}^N$ containing $(\mathbb{R}^N_+)^{\mathrm{cl}}$. If $1 \le j \le N - 1$, then $\int_{\mathbb{R}^N_+} \frac{\partial F}{\partial x_j}\,dx = 0$ since the integral with respect to $dx_j$ is the difference between two values of $F$ and since these are 0 by the compactness of the support. For $j = N$, however, one of the boundary terms may fail to be 0, and the result is that $\int_{\mathbb{R}^N_+} \frac{\partial F}{\partial x_N}\,dx = -\int_{\mathbb{R}^{N-1}} F(x', 0)\,dx'$. Apply the $j$th of these formulas first to $F = u\,\frac{\partial v}{\partial x_j}$ and then to $F = v\,\frac{\partial u}{\partial x_j}$, sum the results on $j$, and subtract the two sums. The result is the formula of the proposition.

⁷This formula is related to but distinct from the formula with the same name at the beginning of Section I.3.


Theorem 3.16. Let $U$ be an open set in $\mathbb{R}^N$, and let $u$ be a continuous scalar-valued function on $U$. If $u$ is harmonic on $U$, then $u$ has the mean-value property on $U$. Conversely if $u$ has the mean-value property on $U$, then $u$ is in $C^\infty(U)$ and is harmonic on $U$.


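PROOF. Suppose that $u$ is harmonic on $U$. We prove that $u$ has the mean-value property. It is enough to treat $x = 0$. Green's formula, as in Proposition 3.14, directly extends from balls to the difference of two balls.⁸ Thus we have

\[ \int_E (u\,\Delta v - v\,\Delta u)\,dx = \int_{\partial E} \Bigl( u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n} \Bigr)\,d\sigma \tag{$*$} \]

whenever $E$ is a closed ball $B_t$ of radius $t$ contained in $U$ or is the difference $B_t - (B_\varepsilon)^o$ of two concentric balls with $\varepsilon < t$. Taking $E = B_t$ and $v = 1$ in $(*)$, we obtain

\[ \int_{\partial B_t} \frac{\partial u}{\partial n}\,d\sigma = 0. \tag{$**$} \]

Routine computation shows that the function given by

\[ v(x) = \begin{cases} |x|^{-(N-2)} & \text{for } N > 2, \\ \log|x| & \text{for } N = 2, \end{cases} \]

is harmonic for $x \ne 0$ and has $\frac{\partial v}{\partial r}$ equal to a nonzero multiple of $|x|^{-(N-1)}$, $r$ being the spherical coordinate radius $|x|$. If we apply $(*)$ to this $v$ and our harmonic $u$ when $E = B_t - (B_\varepsilon)^o$, we obtain

\[ \int_{\partial(B_t - (B_\varepsilon)^o)} \Bigl( u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n} \Bigr)\,d\sigma = 0. \]

Since $v$ depends only on $|x|$, $(**)$ shows that the second term of the integrand yields 0. Thus this formula becomes

\[ \int_{\partial(B_t - (B_\varepsilon)^o)} u\,\frac{\partial v}{\partial n}\,d\sigma = 0. \]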

⁸For the extended result, suppose that the balls have radii $r_1 < r_2$. Then $u$ and $v$ are defined from radius $r_1 - \varepsilon$ to $r_2 + \varepsilon$ for some $\varepsilon > 0$. We can adjust $u$ and $v$ by multiplying by a suitable smooth function that is identically 1 for radius $\ge r_1 - \tfrac13\varepsilon$ and identically 0 for radius $\le r_1 - \tfrac23\varepsilon$, and then $u$ and $v$ will extend as smooth functions for radius $< r_2 + \varepsilon$. Consequently Proposition 3.14 will apply on each ball to the adjusted functions, and subtraction of the results gives the desired version
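of Green's formula.

The normal vector for the inner sphere points toward the center. Hence we can rewrite our equality as

\[ \int_{|x| = \varepsilon} u\,\frac{\partial v}{\partial r}\,d\sigma = \int_{|x| = t} u\,\frac{\partial v}{\partial r}\,d\sigma. \]

Since $\frac{\partial v}{\partial r} = c|x|^{-(N-1)}$ with $c \ne 0$, we obtain

\[ \varepsilon^{-(N-1)} \int_{|x| = \varepsilon} u\,d\sigma = t^{-(N-1)} \int_{|x| = t} u\,d\sigma. \]

On the left side, $d\sigma = \varepsilon^{N-1}\,d\omega$, while on the right side, $d\sigma = t^{N-1}\,d\omega$. Therefore

\[ \int_{|\omega| = 1} u(\varepsilon\omega)\,d\omega = \int_{|\omega| = 1} u(t\omega)\,d\omega \]

whenever $0 < \varepsilon < t$ and $B_t$ is contained in $U$. Dividing by $\Omega_{N-1}$, letting $\varepsilon$ decrease to 0, and using the continuity of $u$, we see that $u(0) = \Omega_{N-1}^{-1} \int_{|\omega| = 1} u(t\omega)\,d\omega$. Thus $u$ has the mean-value property.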
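The conclusion just obtained can be illustrated numerically in the plane, where $S^1$ is the unit circle, $d\omega = d\theta$, and $\Omega_1 = 2\pi$. The sketch below is an assumption-laden illustration rather than part of the proof: it takes the harmonic function $e^x\cos y$ and, for contrast, the nonharmonic function $x^2 + y^2$, and averages each over a circle using equally spaced sample points. The function names are hypothetical.

```python
import math

def circle_average(u, cx, cy, t, n=2000):
    """Average of u over the circle of radius t about (cx, cy).

    Uses n equally spaced angles, i.e. the trapezoid rule for a periodic
    integrand, as a stand-in for (1 / Omega_1) * integral of u(x + t*omega).
    """
    total = 0.0
    for i in range(n):
        angle = 2.0 * math.pi * i / n
        total += u(cx + t * math.cos(angle), cy + t * math.sin(angle))
    return total / n

def harmonic(x, y):
    """exp(x) * cos(y), the real part of exp(x + iy); its Laplacian is 0."""
    return math.exp(x) * math.cos(y)

def not_harmonic(x, y):
    """x^2 + y^2, whose Laplacian is 4."""
    return x * x + y * y

if __name__ == "__main__":
    cx, cy, t = 0.3, -0.2, 0.5
    print(circle_average(harmonic, cx, cy, t), harmonic(cx, cy))
    print(circle_average(not_harmonic, cx, cy, t), not_harmonic(cx, cy))
```

For $x^2 + y^2$ the circle average exceeds the center value by exactly $t^2$, so the mean-value property fails, as it must for a function that is not harmonic.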

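For the converse direction suppose initially that $u$ is in $C^2(U)$. Define

\[ m_t(u)(x) = \Omega_{N-1}^{-1} \int_{|\omega| = 1} u(x + t\omega)\,d\omega \]

whenever $x$ is in $U$ and $t$ is a positive number less than the distance of $x$ to $U^c$. With $x$ fixed, the function $m_t(u)(x)$ has two continuous derivatives. We shall show that

\[ \frac{d^2}{dt^2}\,m_t(u)(x)\Big|_{t=0} = N^{-1}\Delta u(x), \tag{$\dagger$} \]

the derivatives being understood to be one-sided derivatives as $t$ decreases to 0. If $u$ is assumed to have the mean-value property, $m_t(u)(x)$ is constant in $t$, and we can conclude from $(\dagger)$ that $\Delta u(x) = 0$. The computation of $\frac{d^2}{dt^2}\,m_t(u)(x)$ is

\[
\begin{aligned}
m_t(u)(x) &= \Omega_{N-1}^{-1} \int_{|\omega| = 1} u(x_1 + t\omega_1, \dots, x_N + t\omega_N)\,d\omega, \\
\frac{d}{dt}\,m_t(u)(x) &= \Omega_{N-1}^{-1} \int_{|\omega| = 1} \sum_{j=1}^N \omega_j\,D_j u(x + t\omega)\,d\omega, \\
\frac{d^2}{dt^2}\,m_t(u)(x) &= \Omega_{N-1}^{-1} \int_{|\omega| = 1} \sum_{j,k=1}^N \omega_j\omega_k\,D_j D_k u(x + t\omega)\,d\omega.
\end{aligned}
\]

Letting $t$ decrease to 0, we obtain

\[ \frac{d^2}{dt^2}\,m_t(u)(x)\Big|_{t=0} = \Omega_{N-1}^{-1} \sum_{j,k} D_j D_k u(x) \int_{|\omega| = 1} \omega_j\omega_k\,d\omega. \]

If $j \ne k$, then $\int_{|\omega| = 1} \omega_j\omega_k\,d\omega = 0$ since the integrand is an odd function of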
the $j$th variable taken over a set symmetric about 0. The integral $\int_{|\omega| = 1} \omega_j^2\,d\omega$ is independent of $j$ and has the property that $N$ times it is equal to $\int_{|\omega| = 1} |\omega|^2\,d\omega = \int_{|\omega| = 1} d\omega = \Omega_{N-1}$. Thus $\int_{|\omega| = 1} \omega_j^2\,d\omega = N^{-1}\Omega_{N-1}$, and

\[ \frac{d^2}{dt^2}\,m_t(u)(x)\Big|_{t=0} = N^{-1} \sum_{j=1}^N D_j^2 u(x) = N^{-1}\Delta u(x). \]

This proves $(\dagger)$ and completes the argument that a $C^2$ function in $U$ with the mean-value property is harmonic.
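As a check on the constant $N^{-1}$ in $(\dagger)$, one can work out the case $u(x) = |x|^2$, an example chosen here for illustration and not taken from the text. The cross term integrates to 0 because $\int_{|\omega|=1} \omega\,d\omega = 0$ by symmetry.

```latex
m_t(u)(x)
  = \Omega_{N-1}^{-1} \int_{|\omega|=1} \bigl( |x|^2 + 2t\,x\cdot\omega + t^2 \bigr)\,d\omega
  = |x|^2 + t^2 ,
\\[4pt]
\frac{d^2}{dt^2}\, m_t(u)(x) = 2 = N^{-1}(2N) = N^{-1}\Delta u(x) .
```

Both sides equal 2 for every $N$, in agreement with $(\dagger)$; the example also shows that $m_t(u)(x)$ is not constant in $t$ when $\Delta u \ne 0$.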

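Finally suppose that $u$ has the mean-value property and is assumed to be merely continuous. Proposition 3.5e allows us to choose a function $\varphi \ge 0$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ with $\varphi(x) = \varphi_0(|x|)$, $\int_{\mathbb{R}^N} \varphi(x)\,dx = 1$, and $\varphi(x) = 0$ for $|x| \ge 1$. Put $\varphi_\varepsilon(x) = \varepsilon^{-N}\varphi(\varepsilon^{-1}x)$, and define $u_\varepsilon(x) = \int_{\mathbb{R}^N} u(x - y)\varphi_\varepsilon(y)\,dy$ in the open set $U_\varepsilon = \{x \in U \mid D(x, U^c) > \varepsilon\}$. Proposition 3.5c shows that $u_\varepsilon$ is in $C^\infty(U_\varepsilon)$, and the mean-value property of $u$, in combination with the radial nature of $\varphi_\varepsilon$ as expressed by the equality $\varphi_\varepsilon(t\omega) = \varphi_\varepsilon(te_1)$, forces $u_\varepsilon(x) = u(x)$ for all $x$ in $U_\varepsilon$:

\[
\begin{aligned}
u_\varepsilon(x) &= \int_0^\varepsilon \int_{|\omega| = 1} u(x - t\omega)\,\varphi_\varepsilon(t\omega)\,t^{N-1}\,d\omega\,dt \\
&= \int_0^\varepsilon \Omega_{N-1}\,u(x)\,\varphi_\varepsilon(te_1)\,t^{N-1}\,dt \\
&= u(x) \int_{\mathbb{R}^N} \varphi_\varepsilon(y)\,dy = u(x).
\end{aligned}
\]

Since $\varepsilon$ is arbitrary, $u$ is in $C^\infty(U)$. The function $u$ has now been shown to be in $C^2(U)$, and it is assumed to have the mean-value property. Therefore the previous case shows that it is harmonic.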


Corollary 3.17. If $u$ is harmonic on an open subset $U$ of $\mathbb{R}^N$, then $u$ is in $C^\infty(U)$.

PROOF. This follows by using both directions of Theorem 3.16.


A sequence of functions $\{u_n\}$ on a locally compact Hausdorff space $X$ is said to converge uniformly on compact subsets of $X$ if $\lim u_n = u$ pointwise on $X$ and if for each compact subset $K$ of $X$, the convergence is uniform on $K$. For example the sequence $\{x^n\}$ converges to the 0 function on $(0, 1)$ uniformly on compact subsets.
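The example of $\{x^n\}$ on $(0, 1)$ can be made concrete. The following sketch, whose helper name and sample intervals are chosen only for illustration, estimates the supremum of $x^n$ over a closed subinterval by sampling a grid and contrasts it with values at points close to 1, where the convergence to 0 is not uniform.

```python
def sup_of_power(n, a, b, samples=1001):
    """Grid estimate of the supremum of x**n over the closed interval [a, b].

    Since x**n is increasing on [0, 1], the maximum over the grid is attained
    at the right endpoint b, which is included in the grid.
    """
    step = (b - a) / (samples - 1)
    return max((a + i * step) ** n for i in range(samples))

if __name__ == "__main__":
    for n in (10, 50, 200):
        # On the compact set [0.1, 0.9] the supremum tends to 0 ...
        print(n, sup_of_power(n, 0.1, 0.9))
        # ... but close to 1 the values stay near 1 for each fixed n.
        print(n, (1.0 - 1e-6) ** n)
```

Every compact subset of $(0, 1)$ lies in some $[a, b]$ with $b < 1$, on which the supremum is $b^n$, and this tends to 0; on all of $(0, 1)$ the supremum is 1 for every $n$.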


Corollary 3.18. If $\{u_n\}$ is a sequence of harmonic functions on an open subset $U$ of $\mathbb{R}^N$ and if $\{u_n\}$ converges uniformly on compact subsets to $u$, then $u$ is harmonic on $U$.


PROOF. About any point of $U$ is a compact neighborhood lying in $U$, and the convergence is uniform on that neighborhood. Therefore $u$ is continuous. Each integration needed for the mean-value property occurs on a compact subset of $U$, and the uniform convergence allows us to interchange limit and integral. Therefore the mean-value property for each $u_n$, valid because of one direction of Theorem 3.16, implies the mean-value property for $u$. Hence $u$ is harmonic by the converse direction of Theorem 3.16.


Suppose that $U$ is open in $\mathbb{R}^N$ and that $u$ is harmonic on $U$. If $B$ is an open ball in $U$, then $\int_B u\,\Delta\psi\,dx = 0$ for all $\psi \in C^\infty_{\mathrm{com}}(B)$ by Green's formula (Proposition 3.14), since $\psi$ and $\frac{\partial\psi}{\partial n}$ are both identically 0 on the boundary of $B$. We shall use a smooth partition of unity to show that $\int_U u\,\Delta\psi\,dx$ is therefore 0 for all $\psi \in C^\infty_{\mathrm{com}}(U)$. Corollary 3.19 below provides a converse; we shall use the converse in a crucial way in Corollary 3.23 below.

The argument to construct the partition of unity goes as follows. To each point
of $K = \mathrm{support}(\psi)$, we can associate an open ball centered at that point whose
closure is contained in $U$. As the point varies, these open balls cover $K$, and
we extract a finite subcover $\{U_1, \dots, U_k\}$. Lemma 3.15b of Basic constructs an
open cover $\{W_1, \dots, W_k\}$ of $K$ such that $W_i^{\mathrm{cl}}$ is a compact subset of $U_i$ for each $i$.
Now we argue as in the proof of Proposition 3.14 of Basic. A second application
of Lemma 3.15b of Basic gives an open cover $\{V_1, \dots, V_k\}$ of $K$ such that $V_i^{\mathrm{cl}}$ is
compact and $V_i^{\mathrm{cl}} \subseteq W_i$ for each $i$. Proposition 3.5f constructs a smooth function
$g_i \ge 0$ that is 1 on $V_i^{\mathrm{cl}}$ and is 0 off $W_i$. Then $g = \sum_{i=1}^k g_i$ is smooth and $\ge 0$
on $\mathbb{R}^N$ and is $> 0$ everywhere on $K$. A second application of Proposition 3.5f
produces a smooth function $h \ge 0$ on $\mathbb{R}^N$ that is 1 on the set where $g$ is 0 and is 0
on $K$. Then $g + h$ is everywhere positive on $\mathbb{R}^N$, and the functions $\varphi_i = g_i/(g+h)$
form the smooth partition of unity that we shall use.

To apply the partition of unity, we write $\psi = \sum_i \varphi_i \psi$. Then each term $\varphi_i \psi$
is smooth and compactly supported in an open ball whose closure is contained in
$U$. Consequently we have $\int_U u\,\Delta(\varphi_i\psi)\,dx = 0$ for each $i$. Summing on $i$, we
obtain $\int_U u\,\Delta\psi\,dx = 0$, which was what was being asserted.


Corollary 3.19. Suppose that $U$ is open in $\mathbb{R}^N$, that $u$ is continuous on $U$, and
that $\int_U u\,\Delta\psi\,dx = 0$ for all $\psi \in C^\infty_{\mathrm{com}}(U)$. Then $u$ is harmonic on $U$.

PROOF. Let $B$ be an open ball of radius $r$ with closure contained in $U$, fix $\varepsilon > 0$
so as to be $< r$, and let $B_\varepsilon$ be the open ball of radius $r - \varepsilon$ with the same center as
$B$. Construct $\varphi_\varepsilon$ as in the proof of Theorem 3.16, and let $u_\varepsilon = u * \varphi_\varepsilon$. Suppose
that $\psi$ is in $C^\infty_{\mathrm{com}}(B_\varepsilon)$. For $t$ and $x$ in $\mathbb{R}^N$ with $|t| < \varepsilon$, define $\psi_t(x) = \psi(t + x)$.
Since $\psi$ is supported in $B_\varepsilon$, $\psi_t$ is supported in $B$ and therefore

\[
\int_U u(x - t)\,\Delta\psi(x)\,dx = \int_U u(x)\,\Delta\psi(x + t)\,dx = \int_U u\,\Delta\psi_t\,dx = 0,
\]

the last equality holding by the hypothesis. Multiplying by $\varphi_\varepsilon(t)$, integrating for
$|t| < \varepsilon$, and interchanging integrals, we obtain

\[
0 = \int_U \int_{\mathbb{R}^N} u(x - t)\,\varphi_\varepsilon(t)\,\Delta\psi(x)\,dt\,dx = \int_U u_\varepsilon(x)\,\Delta\psi(x)\,dx.
\]

Since $\psi$ vanishes identically near the boundary of $B$, this identity and Green's
formula (Proposition 3.14) together yield $\int_{B_\varepsilon} \psi(x)\,\Delta u_\varepsilon(x)\,dx = 0$ for all $\psi$ in
$C^\infty_{\mathrm{com}}(B_\varepsilon)$. Application of Corollary 3.6a allows us to extend this conclusion to
all $\psi$ in $C_{\mathrm{com}}(B_\varepsilon)$, and then the uniqueness in the Riesz Representation Theorem
shows that we must have $\Delta u_\varepsilon(x) = 0$ for all $x$ in $B_\varepsilon$. As $\varepsilon$ decreases to 0, $u_\varepsilon$
tends to $u$ uniformly on compact sets. By Corollary 3.18, $u$ is harmonic in $B$.
Since the ball $B$ is arbitrary in $U$, $u$ is harmonic in $U$.


Corollary 3.20. Let $U$ be a connected open set in $\mathbb{R}^N$. If $u$ is harmonic in $U$
and $|u|$ attains a maximum somewhere in $U$, then $u$ is constant in $U$.


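PROOF. Suppose that $|u|$ attains a maximum at $x_0$. Multiplying $u$ by a suitable
constant $e^{i\theta}$, we may assume that $u(x_0) = M \ge 0$. The subset $E$ of $U$ where
$u(x)$ equals $M$ is closed and nonempty. It is enough to prove that $E$ is open. Let
$x_1$ be in $E$, and choose an open ball $B$ centered at $x_1$, say of some radius $r > 0$,
that lies in $U$. We show that $B$ lies in $E$. For $0 < t < r$, Theorem 3.16 says that
$u$ has the mean-value property

\[
\frac{1}{\Omega_{N-1}} \int_{S^{N-1}} u(x_1 + t\omega)\,d\omega = u(x_1) = M.
\]

Arguing by contradiction, suppose that $u(x_1 + t_0\omega_0) \ne u(x_1)$ for some $t_0\omega_0$ with
$0 < t_0 < r$. Then $\mathrm{Re}\,u(x_1 + t_0\omega_0) < M - \varepsilon$ for some $\varepsilon > 0$, and continuity
produces a nonempty open set $S$ in the sphere $S^{N-1}$ such that $\mathrm{Re}\,u(x_1 + t_0\omega) <
M - \varepsilon$ for $\omega$ in $S$. If $\sigma$ is the name of the measure on $S^{N-1}$, then we have

\[
\begin{aligned}
M\Omega_{N-1} &= \mathrm{Re}\Bigl(\int_{S^{N-1}} u(x_1 + t_0\omega)\,d\omega\Bigr) \\
&= \int_S \mathrm{Re}\,u(x_1 + t_0\omega)\,d\omega + \int_{S^{N-1}-S} \mathrm{Re}\,u(x_1 + t_0\omega)\,d\omega \\
&\le \int_S (M - \varepsilon)\,d\omega + \int_{S^{N-1}-S} M\,d\omega \\
&= (M - \varepsilon)\sigma(S) + M\sigma(S^{N-1} - S) \\
&= M\Omega_{N-1} - \varepsilon\sigma(S),
\end{aligned}
\]

and we have arrived at a contradiction since $\sigma(S) > 0$.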
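
The maximum principle just proved can be illustrated on the unit disk in $\mathbb{R}^2$. The sketch below is an illustration under stated assumptions, not part of the original argument: the harmonic function $e^x\cos y$, the grid sizes, and the helper names are all hypothetical choices. Because the function is nonconstant, $|u|$ cannot peak at an interior point, so every interior sample stays below the boundary maximum.

```python
import math

def u(x, y):
    # Real part of exp(x + iy); harmonic on all of R^2 and nonconstant.
    return math.exp(x) * math.cos(y)

def max_abs_interior(n=100):
    """Max of |u| over grid points (i/n, j/n) strictly inside the unit disk."""
    best = 0.0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            x, y = i / n, j / n
            if x * x + y * y < 1.0:
                best = max(best, abs(u(x, y)))
    return best

def max_abs_boundary(n=20000):
    """Max of |u| over n equally spaced points of the unit circle."""
    return max(
        abs(u(math.cos(2.0 * math.pi * k / n), math.sin(2.0 * math.pi * k / n)))
        for k in range(n)
    )

if __name__ == "__main__":
    print(max_abs_interior(), max_abs_boundary())
```

The boundary maximum is attained at the point $(1, 0)$, where the value is $e$.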


Corollary 3.21. Let $U$ be a bounded open subset of $\mathbb{R}^N$, and let $\partial U$ be its
boundary. If $u$ is harmonic in $U$ and is continuous on $U^{\mathrm{cl}}$, then $\sup_{x \in U} |u(x)| =
\max_{x \in \partial U} |u(x)|$.


PROOF. Since $u$ is continuous and $U^{\mathrm{cl}}$ is compact, $|u|$ assumes its maximum
$M$ somewhere on $U^{\mathrm{cl}}$. If $|u(x_0)| = M$ for some $x_0$ in $U$, then Corollary 3.20
shows that $u$ is constant on the component of $U$ to which $x_0$ belongs. The closure
of that component cannot equal that component since $\mathbb{R}^N$ is connected. Thus the
closure of that component contains a point of $\partial U$, and $|u|$ must equal $M$ at that
point of $\partial U$. Consequently $\sup_{x \in U} |u(x)| \le \max_{x \in \partial U} |u(x)|$. Since every point
of $\partial U$ is the limit of a sequence of points in $U$, the reverse inequality is valid as
well, and the corollary follows.


Corollary 3.22 (Liouville). Any bounded harmonic function on $\mathbb{R}^N$ is
constant.


REMARKS. The best-known result of Liouville of this kind is one from complex
analysis: that a bounded function analytic on all of $\mathbb{C}$ is constant. This
complex-analysis result is actually a consequence of Corollary 3.22 because the real and
imaginary parts of a bounded analytic function on $\mathbb{C}$ are bounded harmonic
functions on $\mathbb{R}^2$.


PROOF. Suppose that $u$ is harmonic on $\mathbb{R}^N$ with $|u(x)| \le M$. Let $x_1$ and $x_2$
be distinct points of $\mathbb{R}^N$, and let $R > 0$. Since $u$ has the mean-value property
over spheres by Theorem 3.16, $u$ equals its average value over balls. Hence
$u(x_1) = |B(R; 0)|^{-1} \int_{B(R; x_1)} u(x)\,dx$ and $u(x_2) = |B(R; 0)|^{-1} \int_{B(R; x_2)} u(x)\,dx$.
Subtraction gives

\[
\begin{aligned}
u(x_1) - u(x_2) &= |B(R; 0)|^{-1} \Bigl( \int_{B(R; x_1)} u(x)\,dx - \int_{B(R; x_2)} u(x)\,dx \Bigr) \\
&= |B(R; 0)|^{-1} \Bigl( \int_{B(R; x_1) - B(R; x_2)} u(x)\,dx - \int_{B(R; x_2) - B(R; x_1)} u(x)\,dx \Bigr).
\end{aligned}
\]

Therefore

\[
|u(x_1) - u(x_2)| \le |B(R; 0)|^{-1} \int_{B(R; x_1)\,\Delta\,B(R; x_2)} |u(x)|\,dx,
\]

where $B(R; x_1)\,\Delta\,B(R; x_2)$ is the symmetric difference $(B(R; x_1) - B(R; x_2)) \cup
(B(R; x_2) - B(R; x_1))$. Hence

\[
|u(x_1) - u(x_2)| \le \frac{M\,|B(R; x_1)\,\Delta\,B(R; x_2)|}{|B(R; 0)|}
= \frac{M R^N\,|B(1; x_1/R)\,\Delta\,B(1; x_2/R)|}{R^N\,|B(1; 0)|}.
\]

The right side is $|B(1; x_1/R)\,\Delta\,B(1; x_2/R)|$, apart from a constant factor, and the
sets $B(1; x_1/R)\,\Delta\,B(1; x_2/R)$ decrease and have empty intersection as $R$ tends
to infinity. By complete additivity of Lebesgue measure, the measure of the
symmetric difference tends to 0. We conclude that $u(x_1) = u(x_2)$. Therefore $u$
is constant.
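
The decay of the right side in the proof can be made concrete for $N = 2$, where the area of the intersection of two unit disks has a closed form (the standard lens-area formula). The following is a sketch with hypothetical helper names; it evaluates the bound $M\,|B(1; x_1/R)\,\Delta\,B(1; x_2/R)| / |B(1; 0)|$ for growing $R$.

```python
import math

def symdiff_area_unit_disks(d):
    """Area of the symmetric difference of two unit disks in R^2
    whose centers are a distance d >= 0 apart."""
    if d >= 2.0:
        return 2.0 * math.pi  # the disks are disjoint
    # Lens-area formula for the intersection of two unit disks.
    lens = 2.0 * math.acos(d / 2.0) - (d / 2.0) * math.sqrt(4.0 - d * d)
    return 2.0 * (math.pi - lens)

def liouville_bound(M, x1, x2, R):
    """M |B(1; x1/R) symdiff B(1; x2/R)| / |B(1; 0)| for N = 2."""
    d = math.dist(x1, x2) / R
    return M * symdiff_area_unit_disks(d) / math.pi

if __name__ == "__main__":
    for R in (10.0, 100.0, 1000.0, 1.0e6):
        print(R, liouville_bound(1.0, (0.0, 0.0), (3.0, 4.0), R))
```

For small $d$ the symmetric difference has area roughly $4d$, so the bound decays like $1/R$.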


In the final two corollaries let $\mathbb{R}^{N+1}_+$ be the open half space of points $(x, t)$ in
$\mathbb{R}^{N+1}$ such that $x$ is in $\mathbb{R}^N$ and $t > 0$.


Corollary 3.23 (Schwarz Reflection Principle). Suppose that $u(x, t)$ is
harmonic in $\mathbb{R}^{N+1}_+$, that $u$ is continuous on $(\mathbb{R}^{N+1}_+)^{\mathrm{cl}}$, and that $u(x, 0) = 0$ for all
$x$. Then the definition $u(x, -t) = -u(x, t)$ for $t > 0$ extends $u$ to a harmonic
function on all of $\mathbb{R}^{N+1}$.




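PROOF. Define

\[
w(x, t) = \begin{cases} u(x, t) & \text{for } t \ge 0, \\ -u(x, -t) & \text{for } t \le 0. \end{cases}
\]

The function $w$ is continuous. We shall show that $\int_{\mathbb{R}^{N+1}} w\,\Delta\psi\,dx\,dt = 0$ for all
$\psi \in C^\infty_{\mathrm{com}}(\mathbb{R}^{N+1})$, and then Corollary 3.19 shows that $w$ is harmonic. Write $\psi$
as the sum of functions even and odd in the variable $t$. Since $w$ is odd in $t$, the
contribution to $\int_{\mathbb{R}^{N+1}} w\,\Delta\psi\,dx\,dt$ from the even part of $\psi$ is 0. We may thus assume
that $\psi$ is odd in $t$.

For $\varepsilon > 0$, let $R_\varepsilon = \{(x, t) \mid t > \varepsilon\}$. It is enough to show that $\int_{R_\varepsilon} u\,\Delta\psi\,dx\,dt$
has limit 0 as $\varepsilon$ decreases to 0 since $\int_{\mathbb{R}^{N+1}} w\,\Delta\psi\,dx\,dt$ is twice this limit. We
apply Green's formula for a half space (Proposition 3.15) with $v = \psi$ on the set
$R_\varepsilon \subseteq \mathbb{R}^{N+1}$ except for one detail: to get the hypothesis of compact support to be
satisfied, we temporarily multiply $\psi$ by a smooth function that is identically 1 for
$t \ge \varepsilon$ and is identically 0 for $t \le \frac{1}{2}\varepsilon$. Since $u$ is harmonic in $R_\varepsilon$, the result is that

\[
-\int_{R_\varepsilon} u\,\Delta\psi\,dx\,dt = \int_{R_\varepsilon} (\psi\,\Delta u - u\,\Delta\psi)\,dx\,dt
= \int_{\{(x,t) \mid t = \varepsilon\}} \Bigl( u\,\frac{\partial\psi}{\partial t} - \psi\,\frac{\partial u}{\partial t} \Bigr)\,dx.
\]

On the right side, $\lim_{\varepsilon \downarrow 0} \int_{\{(x,t) \mid t = \varepsilon\}} u\,\frac{\partial\psi}{\partial t}\,dx = 0$ since $u(\,\cdot\,, \varepsilon)$ tends uniformly
to 0 on the relevant compact set of $x$'s in $\mathbb{R}^N$.

Thus it is enough to prove that $\lim_{\varepsilon \downarrow 0} \int_{\{(x,t) \mid t = \varepsilon\}} \psi\,\frac{\partial u}{\partial t}\,dx = 0$. Since $\psi(x, t)$
is of class $C^2$, is odd in $t$, and is compactly supported, we have $|\psi(x, t)| \le Ct$
uniformly in $x$ for small positive $t$. Thus it is enough to prove that

\[
\lim_{t \downarrow 0}\, t\,\frac{\partial u}{\partial t}(x, t) = 0 \tag{$*$}
\]

uniformly on compact subsets of $\mathbb{R}^N$.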

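To prove $(*)$, let $\varphi$ be a function as in Proposition 3.5e, and let $\varphi_\varepsilon(x, t) =
\varepsilon^{-(N+1)}\varphi(\varepsilon^{-1}(x, t))$. Fix $x_0$ in $\mathbb{R}^N$, and define $X_0 = (x_0, t_0)$ and $X = (x_0, t)$.
If $|X - X_0| < \frac{1}{3}t_0$, then the mean-value property of $u$ in $\mathbb{R}^{N+1}_+$ gives $u(X) =
(u * \varphi_{\frac{1}{3}t_0})(X)$. Hence we have

\[
\begin{aligned}
\frac{\partial u}{\partial t}(X) &= \frac{\partial}{\partial t} \int_{\mathbb{R}^{N+1}} \varphi_{\frac{1}{3}t_0}(X - Y)\,u(Y)\,dY \\
&= \int_{\mathbb{R}^{N+1}} \frac{\partial}{\partial t} \Bigl[ (\tfrac{1}{3}t_0)^{-(N+1)}\,\varphi\bigl((\tfrac{1}{3}t_0)^{-1}(X - Y)\bigr) \Bigr]\,u(Y)\,dY.
\end{aligned}
\]

In the computation of the partial derivative on the right side, the variable $t$ appears
as the last coordinate of $X$. Therefore this expression is equal to

\[
(\tfrac{1}{3}t_0)^{-1} \int_{\mathbb{R}^{N+1}} (\tfrac{1}{3}t_0)^{-(N+1)}\,\frac{\partial\varphi}{\partial t}\bigl((\tfrac{1}{3}t_0)^{-1}(X - Y)\bigr)\,u(Y)\,dY.
\]

Changing variables in the integration by a dilation in $Y$ shows that this expression
is equal also to

\[
(\tfrac{1}{3}t_0)^{-1} \int_{\mathbb{R}^{N+1}} \frac{\partial\varphi}{\partial t}\bigl((\tfrac{1}{3}t_0)^{-1}X - Y\bigr)\,u(\tfrac{1}{3}t_0 Y)\,dY.
\]

If we write $Y = (y, s)$ and take absolute values, we obtain

\[
\Bigl| \frac{\partial u}{\partial t}(x_0, t_0) \Bigr| \le 3t_0^{-1}\,\Bigl\| \frac{\partial\varphi}{\partial t} \Bigr\|_1 \sup_{\substack{|s - t_0| \le 2t_0/3, \\ y \text{ near } x_0}} |u(Y)|.
\]

The required behavior of $t\,\frac{\partial u}{\partial t}$ follows from this estimate.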


Corollary 3.24. Suppose that $u(x, t)$ is harmonic in $\mathbb{R}^{N+1}_+$, that $u$ is continuous
on $(\mathbb{R}^{N+1}_+)^{\mathrm{cl}}$, and that $u(x, 0) = 0$ for all $x$. If $u$ is bounded, then $u$ is identically 0.


REMARK. Without the assumption of boundedness, the function u(x, t) = t is 
a counterexample. 


PROOF. Corollary 3.23 shows that $u$ extends to a bounded harmonic function
on all of $\mathbb{R}^{N+1}$, and Corollary 3.22 shows that the extended function is constant,
hence identically 0.


4. $H^p$ Theory


As was said at the beginning of Section 3, harmonic functions in a half space, 
through their boundary values and the Poisson integral formula, become a tool in 
analysis for working with functions on the Euclidean boundary. The Poisson in- 
tegral formula, which was introduced in Chapters VIII and IX of Basic, generates 
harmonic functions from boundary values. 

The details are as follows. Let $\mathbb{R}^{N+1}_+$ be the open half space of pairs $(x, t)$ in
$\mathbb{R}^{N+1}$ with $x \in \mathbb{R}^N$ and with $t > 0$ in $\mathbb{R}^1$. We view the boundary $\{(x, 0) \mid x \in \mathbb{R}^N\}$
as $\mathbb{R}^N$. The function

\[
P(x, t) = P_t(x) = \frac{c_N\,t}{(t^2 + |x|^2)^{\frac{1}{2}(N+1)}}
\]

for $t > 0$, with $c_N = \pi^{-\frac{1}{2}(N+1)}\,\Gamma\bigl(\frac{N+1}{2}\bigr)$, is called the Poisson kernel for $\mathbb{R}^{N+1}_+$.
The Poisson integral formula for $\mathbb{R}^{N+1}_+$ is $u(x, t) = (P_t * f)(x)$, where $f$ is
any given function in $L^p(\mathbb{R}^N)$ and $1 \le p \le \infty$, and the function $u$ is called the
Poisson integral of $f$.

If $f$ is in $L^p$, then $u$ is harmonic on $\mathbb{R}^{N+1}_+$, $u(\,\cdot\,, t)$ is in $L^p$ for each $t > 0$, and
$\|u(\,\cdot\,, t)\|_p \le \|f\|_p$. For $1 \le p < \infty$, $\lim_{t \downarrow 0} u(\,\cdot\,, t) = f$ in the norm topology of
$L^p$, while for $p = \infty$, $\lim_{t \downarrow 0} u(\,\cdot\,, t) = f$ in the weak-star topology of $L^\infty$ against
$L^1$. In both cases, $\lim_{t \downarrow 0} \|u(\,\cdot\,, t)\|_p = \|f\|_p$ and $\lim_{t \downarrow 0} u(x, t) = f(x)$ a.e.; this
latter result is known as Fatou's Theorem. When $p = \infty$, the a.e. convergence
occurs at any point where $f$ is continuous, and the pointwise convergence is
uniform on any subset of $\mathbb{R}^N$ where $f$ is uniformly continuous.
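
The normalization of the Poisson kernel can be sanity-checked numerically. The sketch below assumes the standard constant $c_N = \pi^{-(N+1)/2}\,\Gamma\bigl(\frac{N+1}{2}\bigr)$, which gives $c_1 = 1/\pi$ and $c_2 = 1/(2\pi)$; the function names, the truncation width, and the grid size are illustrative choices only.

```python
import math

def c(N):
    """Assumed normalizing constant c_N = pi^{-(N+1)/2} * Gamma((N+1)/2)."""
    return math.pi ** (-(N + 1) / 2.0) * math.gamma((N + 1) / 2.0)

def poisson_kernel_1d(x, t):
    """P_t(x) for N = 1, namely c_1 * t / (t^2 + x^2)."""
    return c(1) * t / (t * t + x * x)

def total_mass_1d(t, half_width=1.0e4, n=400000):
    """Midpoint-rule approximation of the integral of P_t over
    [-half_width, half_width]; the tails outside that interval are dropped."""
    h = 2.0 * half_width / n
    return h * sum(poisson_kernel_1d(-half_width + (i + 0.5) * h, t) for i in range(n))

if __name__ == "__main__":
    print(c(1), 1.0 / math.pi)
    print(total_mass_1d(1.0))
```

The truncated integral falls short of 1 only by the mass of the tails, which for $N = 1$ is $1 - \frac{2}{\pi}\arctan(\text{half\_width}/t)$.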

The $L^p$ theory for $p = 1$ extends from integrable functions to the Banach space
$M(\mathbb{R}^N)$ of finite complex Borel measures. Specifically if $\nu$ is a finite complex
Borel measure on $\mathbb{R}^N$, then the Poisson integral of $\nu$ is defined to be the function
$u(x, t) = (P_t * \nu)(x) = \int_{\mathbb{R}^N} P_t(x - y)\,d\nu(y)$. Then $u$ is harmonic on $\mathbb{R}^{N+1}_+$,
$\|u(\,\cdot\,, t)\|_1 \le \|\nu\|$ for each $t > 0$, $\lim_{t \downarrow 0} u(\,\cdot\,, t) = \nu$ in the weak-star topology of
$M(\mathbb{R}^N)$ against $C_{\mathrm{com}}(\mathbb{R}^N)$, and $\lim_{t \downarrow 0} \|u(\,\cdot\,, t)\|_1 = \|\nu\|$.

The new topic for this section is a converse to the above considerations. For
$1 \le p \le \infty$, we define $H^p(\mathbb{R}^{N+1}_+)$ to be the vector space of functions $u(x, t)$ on
$\mathbb{R}^{N+1}_+$ such that

(i) $u(x, t)$ is harmonic on $\mathbb{R}^{N+1}_+$,

(ii) $\sup_{t > 0} \|u(\,\cdot\,, t)\|_p < \infty$.

With $\|u\|_{H^p}$ defined as $\sup_{t > 0} \|u(\,\cdot\,, t)\|_p$, the vector space $H^p(\mathbb{R}^{N+1}_+)$ is a normed
linear space. If $f$ is in $L^p(\mathbb{R}^N)$, then the facts about the Poisson integral formula
show that the Poisson integral of $f$ is in $H^p(\mathbb{R}^{N+1}_+)$ and its $H^p(\mathbb{R}^{N+1}_+)$ norm
matches the $L^p(\mathbb{R}^N)$ norm of $f$. For $p = 1$, we readily produce further examples.
Specifically if $\nu$ is any member of $M(\mathbb{R}^N)$, then the Poisson integral of $\nu$ is in
$H^1(\mathbb{R}^{N+1}_+)$, with the $H^1(\mathbb{R}^{N+1}_+)$ norm matching the $M(\mathbb{R}^N)$ norm. The theorem
of this section will say that there are no other examples.

The members of $H^\infty(\mathbb{R}^{N+1}_+)$ are exactly the bounded harmonic functions in
the half space $\mathbb{R}^{N+1}_+$, and the tool for obtaining an $L^\infty$ function on $\mathbb{R}^N$ from
this harmonic function is the preliminary form of Alaoglu's Theorem proved in
Basic:⁹ any norm-bounded sequence in the dual of a separable normed linear
space has a weak-star convergent subsequence.¹⁰ We shall use Corollary 3.24 to
see that the harmonic function has to be the Poisson integral of this $L^\infty$ function.


Theorem 3.25. If $1 < p \le \infty$, then any harmonic function in $H^p(\mathbb{R}^{N+1}_+)$ is
the Poisson integral of a function in $L^p(\mathbb{R}^N)$. For $p = 1$, any harmonic function
in $H^1(\mathbb{R}^{N+1}_+)$ is the Poisson integral of a finite complex measure in $M(\mathbb{R}^N)$.


PROOF. We begin by proving that $u(x, t)$ is bounded for $t \ge t_0$. For this step
we may assume that $p < \infty$. Theorem 3.16 shows that $u$ has the mean-value
property. We know as a consequence that if $B$ denotes the ball with center $(x, t)$
and radius $\frac{1}{2}t_0$, then the value of $u$ at $(x, t)$ equals the average value over $B$:

\[
u(x, t) = \frac{1}{|B|} \int_B u(y, s)\,dy\,ds.
\]

Since the measure $|B|^{-1}\,dy\,ds$ on $B$ has total mass 1, Hölder's inequality gives

\[
\begin{aligned}
|u(x, t)|^p &\le \frac{1}{|B|} \int_B |u(y, s)|^p\,dy\,ds \\
&\le \frac{1}{|B|} \int_{|s - t| < \frac{1}{2}t_0} \int_{y \in \mathbb{R}^N} |u(y, s)|^p\,dy\,ds \\
&\le \bigl[ (\tfrac{1}{2}t_0)^{N+1}\,\Omega_N \bigr]^{-1} (N + 1)\,t_0\,\|u\|_{H^p}^p,
\end{aligned}
\]

and the boundedness is proved.

⁹Theorem 5.58 of Basic.
¹⁰The full-fledged version of Alaoglu's Theorem will be stated and proved in Chapter IV.
For each positive integer $k$, define $f_k(x) = u(x, 1/k)$ and $w_k(x, t) =
(P_t * f_k)(x)$. Then the function $w_k(x, t) - u(x, t + 1/k)$ is
(i) harmonic in $(x, t)$ for $t > 0$ since $w_k$ and any translate of $u$ are harmonic,
(ii) bounded as a function of $(x, t)$ for $t > 0$ since $u(x, t + 1/k)$ is bounded
for $t \ge 0$, according to the previous paragraph, and since $w_k$ is the Poisson
integral of the bounded function $f_k$,
(iii) continuous in $(x, t)$ for $t \ge 0$ since $u(x, t + 1/k)$ and $w_k(x, t)$ both have
this property, the latter because $f_k$ is continuous and bounded.
By Corollary 3.24, $w_k(x, t) - u(x, t + 1/k) = 0$. That is,

\[
u(x, t + 1/k) = \int_{\mathbb{R}^N} P_t(x - y)\,f_k(y)\,dy.
\]


Now suppose $p > 1$, so that $L^p$ is the dual space to $L^{p'}$ if $p^{-1} + p'^{-1} = 1$.
Since $u$ is in $H^p$, $\|f_k\|_p \le M$ for the constant $M = \|u\|_{H^p}$. By the preliminary
form of Alaoglu's Theorem, there exists a subsequence $\{f_{k_j}\}$ of $\{f_k\}$ that is
weak-star convergent to some function $f$ in $L^p$. Since for each fixed $t$, $P_t$ is in $L^1 \cap L^\infty$
and hence is in $L^{p'}$, each $(x, t)$ has the property that

\[
u(x, t + 1/k_j) = \int_{\mathbb{R}^N} P_t(x - y)\,f_{k_j}(y)\,dy \longrightarrow \int_{\mathbb{R}^N} P_t(x - y)\,f(y)\,dy.
\]

But $u(x, t + 1/k_j) \to u(x, t)$ by continuity of $u$. We conclude that $u(x, t) =
\int_{\mathbb{R}^N} P_t(x - y)\,f(y)\,dy$.

This proves the theorem for $p > 1$. If $p = 1$, the above argument falls short
of constructing a function $f$ in $L^1$ since $L^1$ is not the dual of $L^\infty$. Instead, we
treat $f_k$ as a complex measure $f_k(x)\,dx$. The norm of $f_k(x)\,dx$ in $M(\mathbb{R}^N)$ equals
$\|f_k\|_1$, and thus the norms of the complex measures $f_k(x)\,dx$ are bounded. The
space $M(\mathbb{R}^N)$ is the dual of $C_{\mathrm{com}}(\mathbb{R}^N)$ and hence also of its uniform closure,
which is the Banach space $C_0(\mathbb{R}^N)$ of continuous functions on $\mathbb{R}^N$ vanishing at
infinity. Let $\{f_{k_j}(x)\,dx\}$ be a weak-star convergent subsequence of $\{f_k(x)\,dx\}$,
with limit $\nu$ in $M(\mathbb{R}^N)$. Since each function $y \mapsto P_t(x - y)$ is in $C_0(\mathbb{R}^N)$, we
have $\lim_j \int_{\mathbb{R}^N} P_t(x - y)\,f_{k_j}(y)\,dy = \int_{\mathbb{R}^N} P_t(x - y)\,d\nu(y)$. This completes the
proof.

For $N = 1$, every analytic function in the upper half plane $\mathbb{R}^2_+$ is automatically
harmonic, and one can ask for a characterization of the subspace of analytic
members of $H^p(\mathbb{R}^2_+)$. Aspects of the corresponding theory are discussed in
Problems 13-20 at the end of the chapter.


5. Calderón-Zygmund Theorem


The Calderón-Zygmund Theorem asserts the boundedness of certain kinds of
important operators on $L^p(\mathbb{R}^N)$ for $1 < p < \infty$. It is an $N$-dimensional
generalization of the theorem giving the boundedness of the Hilbert transform,
which was proved in Chapters VIII and IX of Basic. We state and prove the
Calderón-Zygmund Theorem in this section, and we give some applications to
partial differential equations in the next section.


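Theorem 3.26 (Calderón-Zygmund Theorem). Let $K(x)$ be a $C^1$ function on
$\mathbb{R}^N - \{0\}$ homogeneous¹¹ of degree 0 with mean value 0 over the unit sphere,
i.e., with

\[
\int_{S^{N-1}} K(\omega)\,d\omega = 0.
\]

For each $\varepsilon > 0$, define

\[
T_\varepsilon f(x) = \int_{|t| \ge \varepsilon} \frac{K(t)\,f(x - t)}{|t|^N}\,dt
\]

whenever $1 < p < \infty$ and $f$ is in $L^p(\mathbb{R}^N)$. Then

(a) $\|T_\varepsilon f\|_p \le A_p \|f\|_p$ for a constant $A_p$ independent of $\varepsilon$ and $f$,
(b) $\lim_{\varepsilon \downarrow 0} T_\varepsilon f = Tf$ exists as an $L^p$ limit,
(c) $\|Tf\|_p \le A_p \|f\|_p$ for a constant $A_p$ independent of $f$.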
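
To make the hypotheses on $K$ concrete, here is a small numerical check for one hypothetical kernel in $\mathbb{R}^2$, namely $K(x) = x_1x_2/|x|^2$ (equal to $\frac{1}{2}\sin 2\theta$ in polar coordinates). The kernel and the helper names are illustrative assumptions, not taken from the text; the sketch verifies homogeneity of degree 0 and mean value 0 over the unit circle.

```python
import math

def K(x1, x2):
    """Hypothetical example kernel on R^2 - {0}, homogeneous of degree 0."""
    return x1 * x2 / (x1 * x1 + x2 * x2)

def circle_mean(F, n=3600):
    """Average of F over the unit circle using n equally spaced points."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        total += F(math.cos(theta), math.sin(theta))
    return total / n

if __name__ == "__main__":
    print(K(3.0, 4.0), K(0.3, 0.4))   # equal values along a ray through the origin
    print(circle_mean(K))             # approximately 0
```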


REMARKS. If $1 < p < \infty$ and if $p'$ is the dual index to $p$, then the function
equal to $K(t)/|t|^N$ for $|t| \ge \varepsilon$ and equal to 0 for $|t| < \varepsilon$ is in $L^{p'}$. Therefore, for
each such $p$, $T_\varepsilon f$ is the convolution of an $L^{p'}$ function and an $L^p$ function and is
a well-defined bounded uniformly continuous function. In proving the theorem,
we shall use less about $K(x)$ than the assumed $C^1$ condition on $\mathbb{R}^N - \{0\}$ but more
than continuity. The precise condition that we shall use is that $|K(x) - K(y)| \le
\psi(|x - y|)$ on $S^{N-1}$ for a nondecreasing function $\psi(\delta)$ of one variable that satisfies
$\int_0^1 \frac{\psi(\delta)}{\delta}\,d\delta < \infty$.

¹¹A function $F$ of several variables is homogeneous of degree $m$ if $F(rx) = r^m F(x)$ for all
$r > 0$ and all $x \ne 0$.

The main steps in the proof are to show that the operator $T_1$ equal to $T_\varepsilon$ for $\varepsilon = 1$
is bounded on $L^2$ and is of weak-type $(1, 1)$ in the sense that $|\{x \mid |(T_1 f)(x)| > \xi\}|
\le C\|f\|_1/\xi$. The remainder of the argument is qualitatively similar to the
argument with the Hilbert transform, not really involving any new ideas. We
handle matters in the following order: First we prove as Lemma 3.27 two facts
needed in the $L^2$ analysis, second we give the proof of the boundedness of $T_1$
on $L^2$, third we establish in Lemmas 3.28 and 3.29 a weak-type $(1, 1)$ result
for a wide class of operators, and fourth we show as a special case that $T_1$ is of
weak-type $(1, 1)$. Finally we tend to the remaining details of the proof.


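Lemma 3.27. There is a constant $C$ such that for all $R \ge 1$, all $\varepsilon$ with
$0 < \varepsilon \le 1$, and all nonzero real $a$ and $b$,

(a) $\Bigl| \int_\varepsilon^R \frac{\sin ar\,dr}{r} \Bigr| \le C$,

(b) $\Bigl| \int_\varepsilon^R \frac{(\cos ar - \cos br)\,dr}{r} \Bigr| \le C\bigl(1 + |\log(|a/b|)|\bigr)$.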


PROOF. In (a) and (b), the signs of $a$ and $b$ make no difference, and we may
therefore assume that $a > 0$ and $b > 0$.

In (a), the change of variables $s = ar$ converts the integral into $\int_{a\varepsilon}^{aR} \frac{\sin s\,ds}{s}$.
Since $s^{-1}\sin s$ is integrable near 0, it is enough to consider $\int_0^S \frac{\sin s\,ds}{s}$. Integration
by parts shows that this integral equals $\bigl[ \frac{1 - \cos s}{s} \bigr]_0^S - \int_0^S \frac{(\cos s - 1)\,ds}{s^2}$. The integrated
term tends to a finite limit as $S$ tends to infinity, and the integral is absolutely
convergent. Hence (a) follows.

In (b), possibly by interchanging $a$ and $b$, we may assume that $c = b/a$ is $\le 1$.
The change of variables $s = ar$ converts the integral into $\int_{a\varepsilon}^{aR} \frac{(\cos s - \cos cs)\,ds}{s}$. Since
$|1 - \cos s| \le \frac{1}{2}s^2$ for all $s$, we have $|1 - \cos cs| \le \frac{1}{2}c^2s^2 \le \frac{1}{2}s^2$. So the integrand
is $\le s$ in absolute value everywhere and in particular is integrable for $s$ near 0. It is
therefore enough to show that $\bigl| \int_1^S \frac{(\cos s - \cos cs)\,ds}{s} \bigr| \le C(1 + \log(c^{-1}))$. Integration
by parts gives $\int_1^S \frac{\cos s\,ds}{s} = \bigl[ \frac{\sin s}{s} \bigr]_1^S + \int_1^S \frac{\sin s\,ds}{s^2}$. The integrated term tends to a
finite limit, and the integral is absolutely convergent. Hence the term $\int_1^S \frac{\cos s\,ds}{s}$
is bounded, and it is enough to handle $\int_1^S \frac{\cos cs\,ds}{s}$. Putting $t = cs$ changes this
integral to $\int_c^{cS} \frac{\cos t\,dt}{t}$. If $cS \ge 1$, the integral from 1 to $cS$ contributes a bounded
amount, as is seen by integrating by parts, and the integral from $c$ to 1 contributes
in absolute value at most $\int_c^1 \frac{dt}{t} = \log c^{-1}$. If $cS < 1$, the integral from $c$ to $cS$
contributes in absolute value at most $\bigl| \int_c^{cS} \frac{dt}{t} \bigr| \le \log c^{-1} + \log(cS)^{-1} \le
2\log c^{-1}$.
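
Part (a) of the lemma can be probed numerically. After the substitution $s = ar$ the integral becomes a difference of two values of the sine integral $\mathrm{Si}$, which stays between 0 and $\mathrm{Si}(\pi) \approx 1.8519$ on the positive axis, so the absolute value never exceeds 2. The sketch below is illustrative only: the Simpson-rule helper, its step size, and the sampled parameter values are assumptions.

```python
import math

def sine_integral(lo, hi, step=0.01):
    """Composite Simpson approximation of the integral of sin(s)/s over [lo, hi],
    assuming 0 < lo <= hi."""
    n = max(2, 2 * math.ceil((hi - lo) / (2.0 * step)))  # even number of panels
    h = (hi - lo) / n
    f = lambda s: math.sin(s) / s
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def lemma_a_abs(a, eps, R):
    """Absolute value of the integral of sin(a r)/r over [eps, R],
    computed after the substitution s = |a| r."""
    a = abs(a)  # the sign of a changes only the sign of the integral
    return abs(sine_integral(a * eps, a * R))

if __name__ == "__main__":
    for a in (0.01, 1.0, 37.0):
        for eps in (1.0e-3, 1.0):
            for R in (2.0, 50.0):
                print(a, eps, R, lemma_a_abs(a, eps, R))
```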


PROOF FOR THEOREM 3.26 THAT $T_1$ IS BOUNDED ON $L^2$. Define $k(x)$ to be
$K(x)/|x|^N$ for $|x| \ge 1$ and to be 0 for $|x| < 1$. Then $k$ is an $L^2$ function, and
$T_1 f = k * f$. We show that $T_1$ is bounded on $L^2$ by showing that the Fourier
transform $\mathcal{F}k$ of $k$ is an $L^\infty$ function.

If $I_n$ denotes the indicator function of $\{|x| \le n\}$, then the sequence $\{kI_n\}$
converges to $k$ in $L^2$. By the Plancherel formula, $\{\mathcal{F}(kI_n)\}$ converges to $\mathcal{F}k$ in
$L^2$. Thus a subsequence converges almost everywhere. To simplify the notation,
let $n$ run through the indices of the subsequence. We have just shown that

\[
(\mathcal{F}k)(y) = \lim_n \int_{1 \le |x| \le n} k(x)\,e^{-2\pi i x \cdot y}\,dx,
\]

the limit existing almost everywhere. Write $x = r\omega$ and $y = r'\omega'$, where $r = |x|$
and $r' = |y|$. Then $x \cdot y = rr'\cos\gamma$, where $\cos\gamma = \omega \cdot \omega'$, and $(\mathcal{F}k)(y)$ is the limit
on $n$ of

\[
\begin{aligned}
&\int_{S^{N-1}} \int_1^n K(\omega)\,r^{-N}\,e^{-2\pi i rr'\cos\gamma}\,r^{N-1}\,dr\,d\omega \\
&\quad = \int_{S^{N-1}} \Bigl[ \int_1^n \frac{e^{-2\pi i rr'\cos\gamma}\,dr}{r} \Bigr] K(\omega)\,d\omega \\
&\quad = \int_{S^{N-1}} \Bigl[ \int_1^n \frac{(e^{-2\pi i rr'\cos\gamma} - \cos 2\pi rr')\,dr}{r} \Bigr] K(\omega)\,d\omega
\qquad \text{since $K$ has mean value 0} \\
&\quad = \int_{S^{N-1}} \Bigl[ \int_1^n \frac{(\cos(2\pi rr'\cos\gamma) - \cos 2\pi rr')\,dr}{r} \Bigr] K(\omega)\,d\omega
 - i \int_{S^{N-1}} \Bigl[ \int_1^n \frac{\sin(2\pi rr'\cos\gamma)\,dr}{r} \Bigr] K(\omega)\,d\omega.
\end{aligned}
\]

Let us call the terms on the right side Term I and $-i$ Term II. The inner integral
for Term II is bounded independently of $r$, $r'$, $\gamma$, $n$ by Lemma 3.27a. Since $K$ is
bounded, Term II is bounded.

The inner integral for Term I is bounded by $C(1 + \log(|\cos\gamma|^{-1}))$, according
to Lemma 3.27b. Since $K$ is bounded, the contribution from $C$ by itself yields a
bounded contribution to Term I and is harmless. We are left with a term that in
absolute value is

\[
\le C \int_{S^{N-1}} \log(|\cos\gamma|^{-1})\,|K(\omega)|\,d\omega = C \int_{S^{N-1}} \log(|\omega \cdot \omega'|^{-1})\,|K(\omega)|\,d\omega.
\]

Since $K$ is bounded, it is enough to estimate $\int_{S^{N-1}} \log(|\omega \cdot \omega'|^{-1})\,d\omega$. This
integral is independent of $\omega'$. We introduce spherical coordinates

\[
\begin{aligned}
\omega_1 &= \cos\theta_1, \\
\omega_2 &= \sin\theta_1 \cos\theta_2, \\
&\ \,\vdots
\end{aligned}
\]

and take $\omega' = (1, 0, \dots, 0)$. The integral becomes

\[
\int_{\substack{0 \le \theta_j \le \pi \text{ for } j < N-1, \\ 0 \le \theta_{N-1} \le 2\pi}}
\log(|\cos\theta_1|^{-1})\,\sin^{N-2}\theta_1 \cdots \sin\theta_{N-2}\,d\theta_{N-1} \cdots d\theta_1,
\]

which is a constant times $\int_0^\pi \log(|\cos\theta|^{-1})\,\sin^{N-2}\theta\,d\theta$. This integral in turn
is $\le \int_0^\pi \log(|\cos\theta|^{-1})\,d\theta$, whose finiteness reduces to the local integrability of
$\log(|x|^{-1})$ on the line. Thus Term I is bounded, and the boundedness of $\mathcal{F}k$
follows.
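
The finiteness of the last integral can be confirmed numerically. In fact $\int_0^\pi \log(|\cos\theta|^{-1})\,d\theta = \pi\log 2$, a classical closed form. The sketch below uses a plain midpoint rule; the function name and grid size are illustrative choices.

```python
import math

def log_cos_integral(n=200000):
    """Midpoint-rule approximation of the integral over [0, pi] of log(1/|cos(theta)|).
    With n even, no midpoint lands on the singularity at theta = pi/2."""
    h = math.pi / n
    return h * sum(-math.log(abs(math.cos((i + 0.5) * h))) for i in range(n))

if __name__ == "__main__":
    print(log_cos_integral(), math.pi * math.log(2.0))
```

The logarithmic singularity at $\theta = \pi/2$ is mild enough that even this simple rule converges.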


Lemma 3.28 (Calderón-Zygmund decomposition). Let $f$ be in $L^1(\mathbb{R}^N)$, and
let $\xi$ be a positive real number. Then there exists a finite or infinite disjoint
sequence $\{E_n\}_{n \ge 1}$ of Borel subsets of $\mathbb{R}^N$ such that

(a) for each $E_n$, there exists a ball $B_n = B(r_n; x_n)$ such that the balls $B_n$ and
$B_n^* = B(5r_n; x_n)$ have $B_n \subseteq E_n \subseteq B_n^*$,

(b) $\sum_n |E_n| \le 5^N \|f\|_1/\xi$,

(c) $|f(x)| \le \xi$ almost everywhere off $\bigcup_n E_n$,

(d) $\frac{1}{|E_n|} \int_{E_n} |f(y)|\,dy \le 5^N \xi$ for each $n$.


FIGURE 3.2. Calderón-Zygmund decomposition of $\mathbb{R}^N$ relative to a function at a
certain height. The set where the maximal function of $f$ exceeds $\xi$ lies in the
union of the gray balls. The gray balls have radii 5 times those of the black
balls, and the black balls are disjoint. The function $|f|$ is $\le \xi$ almost
everywhere off the union of the gray balls, and the sum of the volumes
of the gray balls is controlled.


REMARKS. In the 1-dimensional case, this result was embedded in the proof
of Theorem 8.25 of Basic. The sets $E_n$ were open intervals. Extending that
argument too literally to the $N$-dimensional case is unnecessarily complicated
for current purposes. Instead, we settle for an $n$-th set that contains a ball of some
radius about a point and is contained in a ball of 5 times that radius. Thus the $n$-th
set $E_n$ consists of a black ball and part of the corresponding gray ball in Figure
3.2. The fact that $E_n$ has not been precisely located makes the proof of weak-type
$(1, 1)$ in the present section more difficult than the proof of Theorem 8.25 of
Basic.


PROOF. Let $f^*$ be the Hardy-Littlewood maximal function

\[
f^*(x) = \sup_{0 < r < \infty} |B(r; x)|^{-1} \int_{B(r; x)} |f(y)|\,dy,
\]

and let $E = \{x \mid f^*(x) > \xi\}$. If $x$ is in $E$, then $|B(r; x)|^{-1} \int_{B(r; x)} |f(y)|\,dy > \xi$
for some $r > 0$. On the other hand, $\lim_{r \to \infty} |B(r; x)|^{-1} \int_{B(r; x)} |f(y)|\,dy = 0$
since $f$ is integrable. Thus, for each $x$ in $E$, there exists an $r = r_x$ depending on
$x$ such that

\[
|B(r_x; x)|^{-1} \int_{B(r_x; x)} |f(y)|\,dy > \xi
\quad\text{and}\quad
|B(5r_x; x)|^{-1} \int_{B(5r_x; x)} |f(y)|\,dy \le \xi.
\]

Since $\|f\|_1 \ge \int_{B(r_x; x)} |f(y)|\,dy > \xi\,|B(r_x; x)| = r_x^N\,\xi\,|B(1; 0)|$, the radii $r_x$ are
bounded. Applying the Wiener Covering Lemma¹² to the cover $\{B(r_x; x) \mid x \in E\}$
of $E$, we obtain a finite or infinite sequence of points $x_1, x_2, \dots$ such that the
balls $B(r_{x_n}; x_n)$ are disjoint and

\[
E \subseteq \bigcup_n B(5r_{x_n}; x_n). \tag{$*$}
\]

Write $r_n$ for $r_{x_n}$. Put $E_1 = B(5r_1; x_1) - \bigcup_{j \ne 1} B(r_j; x_j)$, and define inductively

\[
E_n = B(5r_n; x_n) - \bigcup_{j=1}^{n-1} E_j - \bigcup_{j \ne n} B(r_j; x_j).
\]

By inspection
(i) the sets $E_n$ are disjoint,

(ii) $B(r_n; x_n) \subseteq E_n \subseteq B(5r_n; x_n)$ for each $n$,

(iii) $\bigcup_n E_n = \bigcup_n B(5r_n; x_n)$.
Property (ii) immediately yields (a). The second inclusion of (ii) gives $\xi|E_n| \le
\xi\,|B(5r_n; x_n)| = 5^N \xi\,|B(r_n; x_n)| < 5^N \int_{B(r_n; x_n)} |f(y)|\,dy$. Summing on $n$ and
taking into account the disjointness of the sets $B(r_n; x_n)$, we obtain $\xi \sum_n |E_n| \le
5^N \int_{\bigcup_n B(r_n; x_n)} |f(y)|\,dy \le 5^N \|f\|_1$. This proves (b). The two inclusions
of (ii) together yield $\int_{E_n} |f(y)|\,dy \le \int_{B(5r_n; x_n)} |f(y)|\,dy \le \xi\,|B(5r_n; x_n)| =
5^N \xi\,|B(r_n; x_n)| \le 5^N \xi\,|E_n|$, and this proves (d). Finally $(*)$ and (iii) together
show that $E \subseteq \bigcup_n E_n$. Therefore $f^*(x) \le \xi$ everywhere off $\bigcup_n E_n$. Since

\[
\lim_{r \downarrow 0} |B(r; x)|^{-1} \int_{B(r; x)} f(y)\,dy = f(x)
\]

almost everywhere on $\mathbb{R}^N$, we see that $|f(x)| \le \xi$ almost everywhere off $\bigcup_n E_n$.
This proves (c).


¹²Lemma 6.41 of Basic.

Lemma 3.29. Let $k$ be in $L^2(\mathbb{R}^N)$, and define $Tf = k * f$ for $f$ in $L^1 + L^2$.
If

(a) $\|Tf\|_2 \le A\|f\|_2$ and
(b) there exist constants $B$ and $\alpha > 0$ such that

\[
\int_{|x| \ge \alpha|y|} |k(x - y) - k(x)|\,dx \le B
\]

independently of $y$,

then the operator $T$ is of weak-type $(1, 1)$ with a constant depending only on $A$,
$B$, $\alpha$, and $N$.


PROOF. We are to estimate the measure of the set of $x$ where $|(Tf)(x)| > \xi$.
Fix $f$ and $\xi$, and apply Lemma 3.28 to obtain disjoint Borel sets $E_n$ and balls
$B_n = B(r_n; x_n)$ and $B_n^* = B(5r_n; x_n)$ with $B_n \subseteq E_n \subseteq B_n^*$ and with the other
properties listed in the lemma. Now that the sets $E_n$ have been determined, we
decompose $f$ into the sum $f = g + b$ of a "good" function and a "bad" function
by

\[
g(x) = \begin{cases} \dfrac{1}{|E_n|} \displaystyle\int_{E_n} f(y)\,dy & \text{for } x \in E_n, \\[2ex] f(x) & \text{for } x \notin \bigcup_n E_n, \end{cases}
\qquad
b(x) = \begin{cases} f(x) - \dfrac{1}{|E_n|} \displaystyle\int_{E_n} f(y)\,dy & \text{for } x \in E_n, \\[2ex] 0 & \text{for } x \notin \bigcup_n E_n. \end{cases}
\]

Since $\{x \mid |Tf(x)| > \xi\} \subseteq \{x \mid |Tg(x)| > \xi/2\} \cup \{x \mid |Tb(x)| > \xi/2\}$, it is
enough to prove
(i) $|\{x \mid |Tg(x)| > \xi/2\}| \le C\|f\|_1/\xi$ and
(ii) $|\{x \mid |Tb(x)| > \xi/2\}| \le C\|f\|_1/\xi$
for some constant $C$ independent of $\xi$ and $f$.
The definition of $g$ shows that $\int_{E_n} |g(x)|\,dx \le \int_{E_n} |f(x)|\,dx$ for all $n$ and
that $|g(x)| = |f(x)|$ for $x \notin \bigcup_n E_n$; therefore $\int_{\mathbb{R}^N} |g(x)|\,dx \le \int_{\mathbb{R}^N} |f(x)|\,dx$.
Also, properties (c) and (d) of the $E_n$'s show that $|g(x)| \le 5^N \xi$ a.e. These two
inequalities, together with the bound $\|Tg\|_2 \le A\|g\|_2$, give

\[
\int_{\mathbb{R}^N} |Tg(x)|^2\,dx \le A^2 \int_{\mathbb{R}^N} |g(x)|^2\,dx
\le 5^N \xi A^2 \int_{\mathbb{R}^N} |g(x)|\,dx \le 5^N \xi A^2 \int_{\mathbb{R}^N} |f(x)|\,dx.
\]

Combining this result with Chebyshev's inequality

\[
|\{x \mid |F(x)| > \beta\}| \le \beta^{-2} \int_{\mathbb{R}^N} |F(x)|^2\,dx
\]

for the function $F = Tg$ and the number $\beta = \xi/2$, we obtain

\[
|\{x \mid |Tg(x)| > \xi/2\}| \le \frac{4}{\xi^2}\,5^N \xi A^2 \int_{\mathbb{R}^N} |f(x)|\,dx = \frac{4 \cdot 5^N A^2 \|f\|_1}{\xi}.
\]

This proves (i).
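
The algebra of the good/bad splitting can be seen in a discrete toy model. The sketch below does not construct the sets $E_n$ of Lemma 3.28; it takes disjoint index ranges as given (hypothetical stand-ins for the $E_n$) and checks the structural facts the proof relies on: $f = g + b$, $b$ has mean 0 on each block, $\|g\|_1 \le \|f\|_1$, and $\|b\|_1 \le 2\|f\|_1$.

```python
def good_bad_split(f, blocks):
    """Split the list f as g + b, given disjoint index ranges `blocks`
    (pairs (start, stop), stand-ins for the sets E_n).
    On each block g is the average of f and b = f - average;
    off the blocks g = f and b = 0."""
    g = list(f)
    b = [0.0] * len(f)
    for start, stop in blocks:
        avg = sum(f[start:stop]) / (stop - start)
        for i in range(start, stop):
            g[i] = avg
            b[i] = f[i] - avg
    return g, b

def l1(v):
    """Discrete L^1 norm."""
    return sum(abs(t) for t in v)

if __name__ == "__main__":
    f = [0.1, 5.0, -3.0, 0.2, 0.0, 7.0, 1.0, 0.3]
    g, b = good_bad_split(f, [(1, 3), (5, 7)])
    print(g)
    print(b)
    print(l1(f), l1(g), l1(b))
```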

For the function $b$, let $b_n$ be the product of $b$ with the indicator function of
$E_n$. Then we have $b = \sum_n b_n$ with the sum convergent in $L^1$. Inspection of
the definition shows that $\|b_n\|_1 \le 2 \int_{E_n} |f(y)|\,dy$, and therefore $\|b\|_1 \le 2\|f\|_1$.
Since $T$ is convolution by the $L^2$ function $k$ and since $b = \sum_n b_n$ in $L^1$, $Tb =
\sum_n Tb_n$ with the sum convergent in $L^2$. A subsequence of partial sums therefore
converges almost everywhere. Inserting absolute values consistently with the
subsequence and then inserting absolute values around each term, we see that

\[
|Tb(x)| \le \sum_n |Tb_n(x)| \quad \text{a.e.}
\]

Let $\alpha$ be the constant in hypothesis (b). The measure of $\bigcup_n B(5\alpha r_n; x_n)$ is

\[
\Bigl| \bigcup_n B(5\alpha r_n; x_n) \Bigr| \le \sum_n |B(5\alpha r_n; x_n)| = 5^N \alpha^N \sum_n |B(r_n; x_n)|
\le 5^N \alpha^N \sum_n |E_n| \le 5^{2N} \alpha^N \|f\|_1/\xi.
\]

Let $X = \mathbb{R}^N - \bigcup_n B(5\alpha r_n; x_n)$. If we show that $\int_X |Tb(x)|\,dx \le C'\|f\|_1$, then
we will have

\[
|\{x \mid |Tb(x)| > \xi/2\}| \le (5^{2N}\alpha^N + 2C')\,\|f\|_1/\xi, \tag{$*$}
\]


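and (ii) will be proved. Put $\tau_n(X) = \{x - x_n \mid x \in X\}$. Since $\int_{E_n} b(y)\,dy = 0$
for each $n$,

\[
\begin{aligned}
\int_X |Tb(x)|\,dx &\le \sum_n \int_X |Tb_n(x)|\,dx \\
&= \sum_n \int_X \Bigl| \int_{E_n} k(x - y)\,b(y)\,dy \Bigr|\,dx \\
&= \sum_n \int_X \Bigl| \int_{E_n} [k(x - y) - k(x - x_n)]\,b(y)\,dy \Bigr|\,dx \\
&\le \sum_n \int_X \int_{E_n} |k(x - y) - k(x - x_n)|\,|b(y)|\,dy\,dx \\
&= \sum_n \int_{E_n} \Bigl[ \int_{\tau_n(X)} |k(x + x_n - y) - k(x)|\,dx \Bigr] |b(y)|\,dy \\
&\le \sum_n \int_{E_n} \Bigl[ \int_{|x| \ge 5\alpha r_n} |k(x + x_n - y) - k(x)|\,dx \Bigr] |b(y)|\,dy.
\end{aligned}
\]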

In the $n$-th term on the right side, $y$ is in $E_n \subseteq B_n^*$, and hence $|x_n - y| \le 5r_n$;
meanwhile, $|x| \ge 5\alpha r_n$. Therefore $|x| \ge 5\alpha r_n \ge \alpha|x_n - y|$. The right side in the
display is not decreased by increasing the region of integration in the $x$ variable,
and hence the right side is

\[
\begin{aligned}
&\le \sum_n \int_{E_n} \Bigl[ \int_{|x| \ge \alpha|x_n - y|} |k(x + x_n - y) - k(x)|\,dx \Bigr] |b(y)|\,dy \\
&\le \sum_n \int_{E_n} B\,|b(y)|\,dy = B\|b\|_1 \le 2B\|f\|_1.
\end{aligned}
\]

Therefore $(*)$ is proved with $C' = 2B$, and the proof of (ii) is complete.


PROOF FOR THEOREM 3.26 THAT $T_1$ IS OF WEAK-TYPE $(1, 1)$. With $k(x)$ taken
to be $K(x)/|x|^N$ for $|x| \ge 1$ and to be 0 for $|x| < 1$, Lemma 3.29 shows that it
is enough to prove that

\[
\int_{|x| \ge 2|y|} |k(x - y) - k(x)|\,dx \le B \tag{$*$}
\]

with $B$ independent of $y$. The function $k$ is bounded, and thus the contribution
to the integral in $(*)$ from the bounded set of $x$'s where $|x| < 1$ is bounded
independently of $y$. The set of $x$'s where $|x - y| < 1$ is a ball whose measure is
bounded as a function of $y$, and thus this set too contributes a bounded term to
the integral in $(*)$. It is therefore enough to prove that


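\[
\int_{\substack{|x| \ge 2|y|, \\ |x - y| \ge 1,\ |x| \ge 1}} \Bigl| \frac{K(x - y)}{|x - y|^N} - \frac{K(x)}{|x|^N} \Bigr|\,dx
\]

is bounded as a function of $y$. If $M$ is an upper bound for $|K|$, then this expression
is

\[
\begin{aligned}
&\le \int_{\substack{|x| \ge 2|y|, \\ |x - y| \ge 1,\ |x| \ge 1}} |K(x - y)|\,\Bigl| \frac{1}{|x - y|^N} - \frac{1}{|x|^N} \Bigr|\,dx
 + \int_{\substack{|x| \ge 2|y|, \\ |x - y| \ge 1,\ |x| \ge 1}} \frac{|K(x - y) - K(x)|}{|x|^N}\,dx \\
&\le M \int_{\substack{|x| \ge 2|y|, \\ |x| \ge 1}} \Bigl| \frac{1}{|x - y|^N} - \frac{1}{|x|^N} \Bigr|\,dx
 + \int_{\substack{|x| \ge 2|y|, \\ |x| \ge 1}} \frac{|K(x - y) - K(x)|}{|x|^N}\,dx.
\end{aligned}
\tag{$**$}
\]

We use the two estimates

\[
|x - y| \le |x| + |y| \le |x| + \tfrac{1}{2}|x| = \tfrac{3}{2}|x|
\]

and

\[
|x - y| \ge |x| - |y| = (\tfrac{1}{2}|x| - |y|) + \tfrac{1}{2}|x| \ge \tfrac{1}{2}|x|.
\]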

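The integrand in the first term of $(**)$ is equal to

\[
\begin{aligned}
\Bigl| \frac{1}{|x - y|^N} - \frac{1}{|x|^N} \Bigr|
&= \frac{\bigl|\,|x|^N - |x - y|^N \bigr|}{|x|^N |x - y|^N}
\le 2^N \frac{\bigl|\,|x|^N - |x - y|^N \bigr|}{|x|^{2N}} \\
&= 2^N \frac{\bigl|\,|x| - |x - y|\,\bigr| \bigl( |x|^{N-1} + |x|^{N-2}|x - y| + \cdots + |x - y|^{N-1} \bigr)}{|x|^{2N}} \\
&\le 2^N \frac{|y| \bigl( |x|^{N-1} + |x|^{N-2}|x - y| + \cdots + |x - y|^{N-1} \bigr)}{|x|^{2N}} \\
&\le 2^N \bigl(\tfrac{3}{2}\bigr)^N \frac{|y| \bigl( |x|^{N-1} + |x|^{N-1} + \cdots + |x|^{N-1} \bigr)}{|x|^{2N}}
= N 3^N \frac{|y|}{|x|^{N+1}}.
\end{aligned}
\]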
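
The resulting pointwise bound, $\bigl|\,|x - y|^{-N} - |x|^{-N}\bigr| \le N3^N|y|/|x|^{N+1}$ whenever $|x| \ge 2|y|$, can be spot-checked at random points. The sketch below is a sanity check only, with hypothetical function names and an arbitrary sampling box; it proves nothing but would expose an algebra slip in the constants.

```python
import math
import random

def lhs(x, y, N):
    """| |x - y|^{-N} - |x|^{-N} | for points x, y of R^N given as tuples."""
    return abs(math.dist(x, y) ** (-N) - math.hypot(*x) ** (-N))

def rhs(x, y, N):
    """N 3^N |y| / |x|^{N+1}."""
    return N * 3 ** N * math.hypot(*y) / math.hypot(*x) ** (N + 1)

def count_violations(N=3, trials=10000, seed=0):
    """Sample random pairs (x, y), keep those with |x| >= 2|y| > 0, and return
    (number of pairs violating lhs <= rhs, number of pairs tested)."""
    rng = random.Random(seed)
    violations = 0
    tested = 0
    for _ in range(trials):
        x = tuple(rng.uniform(-5.0, 5.0) for _ in range(N))
        y = tuple(rng.uniform(-5.0, 5.0) for _ in range(N))
        ny = math.hypot(*y)
        if ny == 0.0 or math.hypot(*x) < 2.0 * ny:
            continue  # outside the region where the estimate is claimed
        tested += 1
        if lhs(x, y, N) > rhs(x, y, N) * (1.0 + 1e-12):
            violations += 1
    return violations, tested

if __name__ == "__main__":
    print(count_violations(2), count_violations(3))
```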


Thus the integral in the first term of $(**)$ is

\[
\begin{aligned}
&\le N3^N \int_{|x| \ge \max\{1,\,2|y|\}} \frac{|y|}{|x|^{N+1}}\,dx
= N3^N \Omega_{N-1} \int_{\max\{1,\,2|y|\}}^{\infty} \frac{|y|}{r^{N+1}}\,r^{N-1}\,dr \\
&= N3^N \Omega_{N-1} \frac{|y|}{\max\{1,\,2|y|\}} \le \tfrac{1}{2} N3^N \Omega_{N-1},
\end{aligned}
\]

and this is bounded independently of $y$.
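For the second term of $(**)$, we start from the estimate

\[
\Bigl| \frac{z}{|z|} - \frac{w}{|w|} \Bigr| \le \frac{|z - w|}{\min\{|z|, |w|\}}. \tag{$\dagger$}
\]

To verify $(\dagger)$, we may assume that $|z| \ge |w|$. Then $\frac{|z|}{|w|} + 1 \ge \frac{2\,z \cdot w}{|z||w|}$ because
the left side is $\ge 2$ and the right side is $\le 2$. Multiplying by $\frac{|z|}{|w|} - 1$, we obtain
$\frac{|z|^2}{|w|^2} - 1 \ge \frac{2\,z \cdot w}{|w|^2} - \frac{2\,z \cdot w}{|z||w|}$. Hence $1 - \frac{2\,z \cdot w}{|z||w|} + 1 \le \frac{|z|^2}{|w|^2} - \frac{2\,z \cdot w}{|w|^2} + 1$, which is the
square of $(\dagger)$.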
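
Reading $(\dagger)$ as the inequality $\bigl| z/|z| - w/|w| \bigr| \le |z - w| / \min\{|z|, |w|\}$ for nonzero vectors (an assumption about the intended statement, since the printed display is hard to read), a random spot check is straightforward. The helper names and the sampling box below are illustrative.

```python
import math
import random

def unit(v):
    """Return v / |v| for a nonzero vector v given as a tuple."""
    r = math.hypot(*v)
    return tuple(c / r for c in v)

def dagger_holds(z, w):
    """Check |z/|z| - w/|w|| <= |z - w| / min(|z|, |w|) up to rounding error."""
    left = math.dist(unit(z), unit(w))
    right = math.dist(z, w) / min(math.hypot(*z), math.hypot(*w))
    return left <= right * (1.0 + 1e-12) + 1e-15

def count_failures(N=3, trials=5000, seed=1):
    """Return (number of failures, number of pairs tested) over random nonzero pairs."""
    rng = random.Random(seed)
    failures = 0
    tested = 0
    for _ in range(trials):
        z = tuple(rng.uniform(-3.0, 3.0) for _ in range(N))
        w = tuple(rng.uniform(-3.0, 3.0) for _ in range(N))
        if math.hypot(*z) == 0.0 or math.hypot(*w) == 0.0:
            continue
        tested += 1
        if not dagger_holds(z, w):
            failures += 1
    return failures, tested

if __name__ == "__main__":
    print(count_failures(2), count_failures(3))
```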


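Using $(\dagger)$ and the definition and monotonicity of the function $\psi$ that is defined
in the remarks with the theorem and that captures the smoothness of $K$, we have

\[
|K(x - y) - K(x)| = \Bigl| K\Bigl( \frac{x - y}{|x - y|} \Bigr) - K\Bigl( \frac{x}{|x|} \Bigr) \Bigr|
\le \psi\Bigl( \Bigl| \frac{x - y}{|x - y|} - \frac{x}{|x|} \Bigr| \Bigr)
\le \psi\Bigl( \frac{|y|}{\min\{|x - y|, |x|\}} \Bigr).
\]

Since $|x - y| \ge \frac{1}{2}|x|$, $\min\{|x - y|, |x|\} \ge \frac{1}{2}|x|$. Thus $\psi\bigl( \frac{|y|}{\min\{|x - y|, |x|\}} \bigr) \le \psi\bigl( \frac{2|y|}{|x|} \bigr)$,
and the computation

\[
\begin{aligned}
\int_{\substack{|x| \ge 2|y|, \\ |x| \ge 1}} \frac{|K(x - y) - K(x)|}{|x|^N}\,dx
&\le \int_{\substack{|x| \ge 2|y|, \\ |x| \ge 1}} \frac{\psi(2|y|/|x|)}{|x|^N}\,dx
= \int_{\substack{|z| \ge 1, \\ |z| \ge 1/(2|y|)}} \frac{\psi(1/|z|)}{|z|^N}\,dz \\
&= \Omega_{N-1} \int_{\max\{1,\,1/(2|y|)\}}^{\infty} \psi(1/r)\,r^{-1}\,dr \\
&= \Omega_{N-1} \int_0^{\min\{1,\,2|y|\}} \psi(\delta)\,\delta^{-1}\,d\delta \\
&\le \Omega_{N-1} \int_0^1 \psi(\delta)\,\delta^{-1}\,d\delta
\end{aligned}
\]

shows that the second term of $(**)$ is bounded independently of $y$.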


PROOF OF REMAINDER OF THEOREM 3.26. We can now argue in the same way
that the Hilbert transform was handled in Chapter IX of Basic. Since $T_1$ has been
shown to be bounded on $L^2$ and to be of weak-type $(1, 1)$, the Marcinkiewicz
Interpolation Theorem given in Theorem 9.20 of Basic shows that $\|T_1 f\|_p \le
A_p \|f\|_p$ for $1 < p \le 2$ with $A_p$ independent of $f$. Lemma 9.22 of Basic extends
this conclusion to $1 < p < \infty$. The argument that proves Theorem 9.23a in
Basic applies here and shows that $\|T_\varepsilon f\|_p \le A_p \|f\|_p$ for $1 < p < \infty$ with $A_p$
independent of $f$ and $\varepsilon$. This proves Theorem 3.26a.

The same argument as in Lemma 9.24 of Basic shows that if $f$ is a $C^1$ function
of compact support on $\mathbb{R}^N$, then

$$
\lim_{\varepsilon\downarrow 0} T_\varepsilon f(x)
$$

exists uniformly and in $L^p$ for every $p > 1$. This proves (b) of Theorem 3.26 for
the dense set of $C^1$ functions $f$ of compact support.

To prove the norm convergence when we are given a general $f$ in $L^p$ with
$1 < p < \infty$, we choose a sequence $f_n$ in the dense set with $f_n \to f$ in $L^p$. Then

$$
\begin{aligned}
\|T_\varepsilon f - T_{\varepsilon'} f\|_p
&\le \|T_\varepsilon(f - f_n)\|_p + \|T_\varepsilon f_n - T_{\varepsilon'} f_n\|_p + \|T_{\varepsilon'}(f_n - f)\|_p\\
&\le A_p\|f_n - f\|_p + \|T_\varepsilon f_n - T_{\varepsilon'} f_n\|_p + A_p\|f_n - f\|_p.
\end{aligned}
$$

Choose $n$ to make the first and third terms small on the right, and then choose $\varepsilon$
and $\varepsilon'$ sufficiently close to 0 so that the second term on the right is small. The
result is that $T_{\varepsilon_n} f$ is Cauchy in $L^p$ along any sequence $\{\varepsilon_n\}$ tending to 0. This
proves Theorem 3.26b.

For any $f$ in $L^p$ with $1 < p < \infty$, we have just seen that $T_\varepsilon f \to Tf$ in $L^p$.
Then (a) gives $\|Tf\|_p = \lim_{\varepsilon\downarrow 0}\|T_\varepsilon f\|_p \le \limsup_{\varepsilon\downarrow 0} A_p\|f\|_p = A_p\|f\|_p$. This
proves Theorem 3.26c.


6. Applications of the Calderón–Zygmund Theorem


EXAMPLE 1. Riesz transforms. These are a more immediate $N$-dimensional
analog of the Hilbert transform than is the operator in the Calderón–Zygmund
Theorem. In $\mathbb{R}^1$, the Poisson kernel and conjugate Poisson kernel are given by

$$
P(x,y) = P_y(x) = \frac{1}{\pi}\,\frac{y}{x^2+y^2}
\qquad\text{and}\qquad
Q(x,y) = Q_y(x) = \frac{1}{\pi}\,\frac{x}{x^2+y^2}.
$$

The conjugate Poisson kernel $Q$ may be obtained starting from the Poisson kernel
$P$ by applying the Cauchy–Riemann equations in the form

$$
\frac{\partial P}{\partial x} = \frac{\partial Q}{\partial y}
\qquad\text{and}\qquad
\frac{\partial Q}{\partial x} = -\frac{\partial P}{\partial y}
$$
and by requiring that Q vanish at infinity. The differential equations lead to the 
solution 


The Hilbert transform kernel may be obtained by letting $y$ decrease to 0 in $Q(x, y)$.
The resulting formal convolution formula

$$
Hf(x) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{f(x-t)}{t}\,dt
$$

is to be interpreted in such a way as to represent passage from the boundary values
of $P_y * f$ to the boundary values of $Q_y * f$. We know that a valid way of arriving
at this interpretation is to take the integral for $|t| \ge \varepsilon$ and let $\varepsilon$ decrease to 0.

In $N$ dimensions the Poisson kernel for $\mathbb{R}^{N+1}_+$ is

$$
P(x,t) = P_t(x) = \frac{c_N\,t}{(|x|^2+t^2)^{\frac12(N+1)}},
\qquad x\in\mathbb{R}^N,\ t>0,
$$

with $c_N = \pi^{-\frac12(N+1)}\,\Gamma\big(\frac{N+1}{2}\big)$. If we write $x_{N+1}$ in place of $t$, the natural extension
of the Cauchy–Riemann equations is the system for the $(N+1)$-component
function $u = (u_1,\dots,u_{N+1})$ given by

$$
\operatorname{div} u = 0 \qquad\text{and}\qquad \operatorname{curl} u = 0,
$$

i.e.,

$$
\sum_{j=1}^{N+1}\frac{\partial u_j}{\partial x_j} = 0
\qquad\text{and}\qquad
\frac{\partial u_i}{\partial x_j} = \frac{\partial u_j}{\partial x_i}\quad\text{when } i\ne j.
$$
A solution is $(Q_1,\dots,Q_N,P)$, where

$$
Q_j(x,t) = \frac{c_N\,x_j}{(|x|^2+t^2)^{\frac12(N+1)}},
\qquad x\in\mathbb{R}^N,\ t>0.
$$

Imitating the procedure summarized above for the Hilbert transform, we let $t$
decrease to 0 here and arrive at the kernel

$$
\frac{c_N\,x_j}{|x|^{N+1}}.
$$

Accordingly, we define the $j$th Riesz transform for $1 \le j \le N$ by

$$
R_j f(x) = c_N\lim_{\varepsilon\downarrow 0}\int_{|y|\ge\varepsilon}\frac{y_j}{|y|^{N+1}}\,f(x-y)\,dy.
$$

The Calderón–Zygmund Theorem (Theorem 3.26) shows that $R_j$ is a bounded
operator on $L^p(\mathbb{R}^N)$ for $1 < p < \infty$. The multiplier on the Fourier transform
side can be obtained routinely from the formula for the Fourier transform of
$P_t(x)$, namely $\widehat{P_t}(y) = e^{-2\pi t|y|}$, by using the differential equations and letting $t$
decrease to 0. The result is

$$
\widehat{R_j f}(y) = -\frac{i\,y_j}{|y|}\,\hat{f}(y).
$$
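
The multiplier $-iy_j/|y|$ lends itself to a quick arithmetic check. The sketch below assumes the Fourier transform convention $\hat f(y) = \int f(x)e^{-2\pi i x\cdot y}\,dx$, which is consistent with $\widehat{P_t}(y) = e^{-2\pi t|y|}$, so that $\partial/\partial x_j$ has symbol $2\pi i y_j$ and the Laplacian has symbol $-4\pi^2|y|^2$. It checks two consequences at a sample frequency: the squares of the multipliers sum to $-1$, and the symbol of a mixed second derivative equals minus the product of two Riesz multipliers times the symbol of the Laplacian. All function names are illustrative assumptions.

```python
import math

def riesz_multiplier(y, j):
    """Symbol -i y_j / |y| of the j-th Riesz transform at a nonzero frequency y."""
    r = math.sqrt(sum(t * t for t in y))
    return complex(0.0, -y[j] / r)

def sum_of_squares(y):
    """Sum over j of the squared multipliers; it equals -1 for every nonzero y."""
    return sum(riesz_multiplier(y, j) ** 2 for j in range(len(y)))

def mixed_derivative_symbol(y, j, k):
    """Symbol (2 pi i y_j)(2 pi i y_k) of the mixed partial derivative in x_j and x_k."""
    return (2j * math.pi * y[j]) * (2j * math.pi * y[k])

def via_riesz(y, j, k):
    """Symbol of -R_j R_k composed with the Laplacian (symbol -4 pi^2 |y|^2)."""
    laplacian_symbol = -4.0 * math.pi ** 2 * sum(t * t for t in y)
    return -riesz_multiplier(y, j) * riesz_multiplier(y, k) * laplacian_symbol

y_sample = [1.5, -2.0, 0.5]  # an arbitrary nonzero frequency in R^3
```

The second identity is the Fourier-side form of the relation between mixed second derivatives and the Laplacian that the sample application below relies on.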

A sample application of the Riesz transforms is to an inequality asserting
that the Laplacian controls all mixed second derivatives for smooth functions of
compact support:

$$
\Big\|\frac{\partial^2\varphi}{\partial x_j\,\partial x_k}\Big\|_p \le A_p\|\Delta\varphi\|_p
\qquad\text{for } 1 < p < \infty \text{ and } \varphi\in C^\infty_{\mathrm{com}}(\mathbb{R}^N).
$$

The argument works as well for all Schwartz functions $\varphi$: the partial derivatives
satisfy the identity $\frac{\partial^2\varphi}{\partial x_j\,\partial x_k} = -R_jR_k\Delta\varphi$ because the equality

$$
-4\pi^2 y_j y_k\,\hat{\varphi}(y)
= -\Big(-\frac{i\,y_j}{|y|}\Big)\Big(-\frac{i\,y_k}{|y|}\Big)\big(-4\pi^2|y|^2\,\hat{\varphi}(y)\big)
$$

shows that the Fourier transforms are equal.

EXAMPLE 2. Beltrami equation. This will be an application in which the $L^p$
theory of the Calderón–Zygmund Theorem is essential for some $p \ne 2$. We deal
with functions on $\mathbb{R}^2$. Define

$$
\frac{\partial}{\partial z} = \frac12\Big(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\Big)
\qquad\text{and}\qquad
\frac{\partial}{\partial\bar z} = \frac12\Big(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\Big).
$$

We shall use the abbreviations $f_z = \frac{\partial f}{\partial z}$ and $f_{\bar z} = \frac{\partial f}{\partial\bar z}$. The Cauchy–Riemann
equations, testing whether a complex-valued function on $\mathbb{R}^2$ is analytic, become
the single equation $f_{\bar z} = 0$.

We shall use weak derivatives on $\mathbb{R}^2$ in the sense of Section 2. Let $\mu$ be in
$L^\infty(\mathbb{R}^2)$ with $\|\mu\|_\infty = k < 1$. In the sense of weak derivatives, the Beltrami
equation is

$$
f_{\bar z} = \mu f_z.
$$

This equation is fundamental in dealing with Riemann surfaces, since solutions
to it provide “quasiconformal mappings” with certain properties. For simplicity
we assume that $\mu$ has compact support. We seek a solution $f$ such that $f(0) = 0$
and $f_z - 1$ is in some $L^p$ class.

The equation is solved by first putting it in another form. Let

$$
Ph(\zeta) = -\frac{1}{\pi}\int_{\mathbb{R}^2}\Big(\frac{1}{z-\zeta} - \frac{1}{z}\Big)h(z)\,dx\,dy.
$$

The factor in parentheses is in $L^q(\mathbb{R}^2)$ for $1 < q < 2$, and Hölder’s inequality
shows that $Ph$ is therefore well defined for $h$ in $L^p(\mathbb{R}^2)$ if $p > 2$. In fact, one
can show that $|Ph(\zeta_1) - Ph(\zeta_2)| \le C_p\|h\|_p\,|\zeta_1 - \zeta_2|^{1-\frac{2}{p}}$, and therefore $Ph$ is
continuous for such $h$. Observe that $Ph(0) = 0$ for all $h$. Also, one can show
that

$$
(Ph)_{\bar z} = h \quad\text{in the sense of weak derivatives.} \tag{*}
$$

However, the definition of $P$ falls apart for $p = 2$. Now define

$$
Th(\zeta) = \lim_{\varepsilon\downarrow 0}\; -\frac{1}{\pi}\int_{|z-\zeta|\ge\varepsilon}\frac{h(z)}{(z-\zeta)^2}\,dx\,dy.
$$

The operator $T$ is bounded on $L^p(\mathbb{R}^2)$ for $1 < p < \infty$ by the Calderón–Zygmund
Theorem, and we shall be interested in $h$ as above, thus interested in $p > 2$. One
can show that

$$
(Ph)_z = Th \quad\text{in the sense of weak derivatives if } h\in L^p \text{ with } p > 2. \tag{**}
$$


Now we can transform the Beltrami equation. Suppose that $f$ is a weak solution
of the Beltrami equation with $f(0) = 0$ and $f_z - 1$ in $L^p$ for some $p$ with $p > 2$.
Since $\mu$ is in $L^\infty$, $\mu f_z - \mu$ is in $L^p$, and since $\mu$ has compact support, $\mu f_z$ is in
$L^p$. Then $f_{\bar z} = \mu f_z$ is in $L^p$, and $P(f_{\bar z})$ is defined. The function $f - P(f_{\bar z})$ is
analytic because (*) shows that $(f - P(f_{\bar z}))_{\bar z} = f_{\bar z} - f_{\bar z} = 0$. One can easily
show that this analytic function has to be $z$, i.e., that

$$
f = P(f_{\bar z}) + z.
$$

Differentiating with respect to $z$ and using (**), we obtain $f_z = T(f_{\bar z}) + 1 =
T(\mu f_z) + 1$. The equation

$$
f_z = T(\mu f_z) + 1 \tag{†}
$$

is the transformed equation.

Assuming that $f$ is a solution of the Beltrami equation and therefore of (†),
we shall manipulate (†) a little and arrive at a formula for $f$. Multiply (†) by $\mu$
and apply $T$ to get $T(\mu f_z) = T\mu T\mu f_z + T\mu$. Adding 1 and substituting from
(†) gives

$$
f_z = T\mu T\mu f_z + T\mu + 1.
$$

Iteration of this procedure yields

$$
f_z = (T\mu)^n f_z + [1 + T\mu + \cdots + (T\mu)^{n-1}].
$$

We want to arrange that the first term on the right side tends to 0 in the limit
on $n$. The operations of $P$ and $T$ have together made sense only on $L^p$ for
$p > 2$. The linear operator $g \mapsto \mu g$ on $L^p$ has norm $\|\mu\|_\infty = k < 1$, and $T$
has norm $A_p$, say. It can be shown that $T$ is unitary on $L^2$, so that $A_2 = 1$. The
Marcinkiewicz Interpolation Theorem does not reveal good limiting behavior for
the bounds of operators at the endpoints of an interval of $p$’s where it is applied,
but the Riesz Convexity Theorem$^{13}$ does. Consequently we can conclude that
$\limsup_{p\downarrow 2} A_p = 1$. Therefore the operator $g \mapsto T\mu g$, with norm $\le kA_p$ on $L^p$
for $p > 2$, has norm $< 1$ if $p$ is sufficiently close to 2 (but is greater than 2). Fix
such a $p$. Then we have

$$
\|(T\mu)^n f_z\|_p \le \|T\mu\|^{n-1}\,\|T\|\,\|\mu f_z\|_p \to 0
$$

and

$$
f_z = \lim_n\,[1 + T\mu + \cdots + (T\mu)^{n-1}].
$$

The function $f_z - 1 = \lim_n[T\mu + \cdots + (T\mu)^{n-1}]$ is certainly in $L^p$. As a solution
of the Beltrami equation, $f$ has $f_{\bar z} = \mu f_z = \mu + \mu\lim_n[T\mu + \cdots + (T\mu)^{n-1}]$.
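
The iteration above is a Neumann series: an equation of the form $h = Ah + b$ is solved by $b + Ab + A^2b + \cdots$ whenever the operator $A$ has norm less than 1. The following sketch illustrates that mechanism in a finite-dimensional stand-in only. The 2-by-2 matrix `A` plays the role of $g \mapsto T\mu g$ and the vector `b` plays the role of the constant 1; both are hypothetical choices made for illustration, and nothing here computes the actual operator $T$.

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * t for a, t in zip(row, v)) for row in A]

def neumann_partial_sum(A, b, n):
    """Return b + A b + ... + A^(n-1) b via the iteration h <- A h + b from h = 0."""
    h = [0.0] * len(b)
    for _ in range(n):
        h = [s + t for s, t in zip(mat_vec(A, h), b)]
    return h

def residual(A, b, h):
    """Max-norm of h - (A h + b); it is zero exactly when h solves h = A h + b."""
    return max(abs(hi - (ai + bi)) for hi, ai, bi in zip(h, mat_vec(A, h), b))

# Hypothetical data: every row of A has absolute row sum at most 0.5,
# so A has operator norm at most 0.5 with respect to the max-norm.
A = [[0.2, -0.3], [0.1, 0.4]]
b = [1.0, 1.0]
h = neumann_partial_sum(A, b, 60)
```

The residual of the $n$th partial sum is the size of $A^n b$, which decays geometrically; this mirrors the role of the condition $kA_p < 1$ in the argument.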


$^{13}$The Riesz Convexity Theorem uses complex analysis. It was stated in Chapter IX of Basic,
but the proof was omitted.

We saw above that any solution $f$ of the Beltrami equation with $f(0) = 0$ and with
$f_z - 1$ in $L^p$ has to satisfy $f = P(f_{\bar z}) + z$. Thus our formula for $f$ is

$$
f = P\big(\mu + \mu\lim_n[T\mu + \cdots + (T\mu)^{n-1}]\big) + z.
$$

Finally we can turn things around and check that this process actually gives a
solution. Define $g = \mu + \mu\lim_n[T\mu + \cdots + (T\mu)^{n-1}]$ in $L^p$, and put $f = Pg + z$.
Application of (*) and (**) gives $f_{\bar z} = g$ and $f_z = Tg + 1$. Substitution of the
formula for $g$ into these yields

$$
\begin{aligned}
f_{\bar z} &= \mu + \mu\lim_n[T\mu + \cdots + (T\mu)^{n-1}]
= \mu\big(1 + \lim_n[T\mu + \cdots + (T\mu)^{n-1}]\big)\\
&= \mu\big(1 + T\lim_n[\mu + \mu T\mu + \cdots + \mu(T\mu)^{n-2}]\big)
= \mu(1 + Tg) = \mu f_z,
\end{aligned}
$$

as required. The equality $f_z = Tg + 1$ shows that $f_z - 1$ is in $L^p$, and the fact
that $Ph(0) = 0$ for all $h$ shows that $f(0) = (Pg + z)(0) = 0$.


7. Multiple Fourier Series 


Fourier series in several variables are a handy tool for local problems with linear
differential equations. One isolates a problem in a bounded subset of $\mathbb{R}^N$ and
then reproduces it periodically in each variable, using a large period. Multiple
Fourier series for potentially rough functions is a complicated subject, but we have
no need for it. What is required is information about Fourier series of smooth
functions. The relevant theory is presented in this section, using $2\pi$ for the period
in each variable, and a relatively simple application is given in the next section.
A more decisive application appears in Chapter VII, where we establish local
solvability of linear partial differential equations with constant coefficients.

If $f$ is a locally integrable function on $\mathbb{R}^N$ that is periodic of period $2\pi$ in each
variable, its multiple Fourier series is given by

$$
f(x) \sim \sum_k c_k e^{ik\cdot x},
$$

the sum being over all integer $N$-tuples and the coefficients $c_k$ being given by

$$
c_k = (2\pi)^{-N}\int_{-\pi}^{\pi}\cdots\int_{-\pi}^{\pi} f(x)\,e^{-ik\cdot x}\,dx.
$$


Let us write $\mathbb{Z}^N$ for the set of all integer $N$-tuples and $[-\pi,\pi]^N$ for the region of
integration. Such series have the following properties.

Proposition 3.30. If $f$ is a locally integrable function on $\mathbb{R}^N$ that is periodic
of period $2\pi$ in each variable, then
(a) $|c_k| \le \|f\|_1$ relative to $L^1([-\pi,\pi]^N, (2\pi)^{-N}\,dx)$,
(b) $|c_k| \le C_M|k|^{-M}$ for every positive integer $M$ if $f$ is smooth,
(c) $\sum_{k\in\mathbb{Z}^N} c_k e^{ik\cdot x}$ is smooth and periodic if $|c_k| \le C_M|k|^{-M}$ for every
positive integer $M$,
(d) $\{e^{ik\cdot x}\}_{k\in\mathbb{Z}^N}$ is an orthonormal basis of $L^2([-\pi,\pi]^N, (2\pi)^{-N}\,dx)$,
(e) $f(x) = \sum_{k\in\mathbb{Z}^N} c_k e^{ik\cdot x}$ if $f$ is smooth.

PROOF. Conclusion (a) is evident by inspection of the definition. For (b),
integration by parts shows that any $C^1$ periodic function $f$ has the property that

$$
(ik_j)\int_{[-\pi,\pi]^N} f(x)\,e^{-ik\cdot x}\,dx = \int_{[-\pi,\pi]^N} D_j f(x)\,e^{-ik\cdot x}\,dx.
$$

Apart from the factor of $(2\pi)^{-N}$, the right side is a Fourier coefficient, and its
size is controlled by (a). Iterating this formula, we see, in the case that $f$ is
smooth, that the Fourier coefficients $c_k$ of $f$ have the property that $\{P(k)c_k\}_{k\in\mathbb{Z}^N}$
is bounded for every polynomial $P$. Then (b) follows.

Conclusion (c) is immediate from the standard theorem about interchanging
sums and derivatives. The result (d) is known in the 1-dimensional case, and the
$N$-dimensional case then follows from Proposition 12.9 of Basic. In (e), the series
converges to $f$ in $L^2$ as a consequence of (d), and hence a subsequence converges
almost everywhere to $f$. On the other hand, the series converges uniformly to
something smooth by (c). The smooth limit must be almost everywhere equal to
$f$, and it must equal $f$ since $f$ is smooth.
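
The integration-by-parts identity behind (b) and the resulting rapid decay of the coefficients can be observed numerically in one variable. The sketch below is an illustration under stated assumptions: the test function $f(x) = e^{\cos x}$, the grid size `n = 64`, and the name `fourier_coefficient` are choices made here, not taken from the text. It approximates $c_k = (2\pi)^{-1}\int_{-\pi}^{\pi} f(x)e^{-ikx}\,dx$ by a uniform Riemann sum.

```python
import cmath
import math

def fourier_coefficient(func, k, n=64):
    """Approximate (2 pi)^(-1) * integral over [-pi, pi] of func(x) e^{-ikx} dx
    by an n-point Riemann sum on a uniform grid."""
    total = 0j
    for m in range(n):
        x = -math.pi + 2.0 * math.pi * m / n
        total += func(x) * cmath.exp(-1j * k * x)
    return total / n

def f(x):
    """A smooth 2 pi-periodic test function."""
    return math.exp(math.cos(x))

def df(x):
    """The derivative of f."""
    return -math.sin(x) * math.exp(math.cos(x))

# Integration by parts predicts (ik) c_k(f) = c_k(f') for every integer k.
mismatch = max(
    abs(1j * k * fourier_coefficient(f, k) - fourier_coefficient(df, k))
    for k in range(-5, 6)
)
```

For this particular $f$ the coefficients shrink very quickly as $|k|$ grows, in line with (b); comparing `abs(fourier_coefficient(f, k))` for $k = 1, 5, 10$ makes the decay visible.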


8. Application to Traces of Integral Operators 


We return to the topic of traces of linear operators on Hilbert spaces, which was
introduced in Section II.5. That section defined trace-class operators as a subset
of the compact operators, and the trace of such an operator $L$ is then given by
$\sum_i (Lu_i, u_i)$, where $\{u_i\}$ is an orthonormal basis. The defining condition for
trace class was hard to check, but Proposition 2.9 gave a sufficient condition: if
$L : V \to V$ is bounded and if $\sum_{i,j} |(Lu_i, v_j)| < \infty$ for some orthonormal bases
$\{u_i\}$ and $\{v_j\}$, then $L$ is of trace class.

In this section we use multiple Fourier series to show how traces can be
computed for simple integral operators in a Euclidean setting. The setting for
realistic applications is to be a compact smooth manifold. Such manifolds are
introduced in Chapter VIII, and the present result is to be regarded as the main
step toward a theorem about traces of integral operators on smooth manifolds.$^{14}$


'4Traces of integral operators play a role in the representation theory of noncompact locally com-
Proposition 3.31. Let $K(\,\cdot\,,\,\cdot\,)$ be a complex-valued smooth function on
$\mathbb{R}^N \times \mathbb{R}^N$ that is periodic of period $2\pi$ in each of the $2N$ variables, and suppose
that the subset of $[-\pi,\pi]^N \times [-\pi,\pi]^N$ where $K$ is nonzero is contained in
eee ea aoe ]’. Define a bounded linear operator L on the Hilbert space 
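$L^2([-\pi,\pi]^N, (2\pi)^{-N}\,dx)$ by

$$
Lf(x) = \frac{1}{(2\pi)^N}\int_{[-\pi,\pi]^N} K(x,y)\,f(y)\,dy.
$$

Then $L$ is of trace class, and its trace is given by

$$
\operatorname{Tr} L = \frac{1}{(2\pi)^N}\int_{[-\pi,\pi]^N} K(x,x)\,dx.
$$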

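PROOF. For each $k$ in $\mathbb{Z}^N$, the effect of $L$ on the function $x \mapsto e^{ik\cdot x}$ is

$$
L(e^{ik\cdot(\,\cdot\,)})(x) = \frac{1}{(2\pi)^N}\int_{[-\pi,\pi]^N} K(x,y)\,e^{ik\cdot y}\,dy.
$$

Taking the inner product in $L^2([-\pi,\pi]^N, (2\pi)^{-N}\,dx)$ with $x \mapsto e^{il\cdot x}$ gives

$$
\big(L(e^{ik\cdot(\,\cdot\,)}),\,e^{il\cdot(\,\cdot\,)}\big)
= \frac{1}{(2\pi)^{2N}}\int_{[-\pi,\pi]^N}\int_{[-\pi,\pi]^N} K(x,y)\,e^{ik\cdot y}\,e^{-il\cdot x}\,dy\,dx. \tag{*}
$$

The right side is a multiple-Fourier-series coefficient of the function $K$, and it is
estimated by Proposition 3.30b. Proposition 3.30c shows that the corresponding
trigonometric series converges absolutely. The functions $e^{ik\cdot x}$ are an orthonormal
basis of $L^2([-\pi,\pi]^N, (2\pi)^{-N}\,dx)$ as a consequence of Proposition 3.30d, and
therefore the sufficient condition of Proposition 2.9 is met for $L$ to be of trace
class.

To compute the trace, we start from (*) with $k = l$. We change variables,
letting $u = y - x$ and $v = y + x$, and the right side of (*) becomes

$$
\frac{1}{2^N(2\pi)^{2N}}\int_{[-\pi,\pi]^N}\int_{[-\pi,\pi]^N} e^{ik\cdot u}\,K\big(\tfrac12(v-u),\,\tfrac12(v+u)\big)\,du\,dv
$$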


because of the small support of $K$. We sum on $k$ in $\mathbb{Z}^N$, moving the sum
under the integration with respect to $v$ and recognizing the sum inside as the
sum of the multiple-Fourier-series coefficients in the $u$ variable, i.e., the sum


pact groups and in index theory. Both these topics are beyond the scope of this book. Consequently
Chapter VIII does not carry out the easy argument to extend the Euclidean result to compact smooth
manifolds.
of the series at the origin. Since the functions $e^{ik\cdot u}$ are an orthonormal basis of
$L^2([-\pi,\pi]^N, (2\pi)^{-N}\,dx)$, the sum of the uniformly convergent multiple Fourier
series has to be the function itself. Thus we find that

$$
\operatorname{Tr} L = \frac{1}{(4\pi)^N}\int_{[-\pi,\pi]^N} K\big(\tfrac12 v,\,\tfrac12 v\big)\,dv.
$$

Replacing $\tfrac12 v$ by $v$ and again taking into account the small support of $K$, we
obtain the formula asserted.


9. Problems 


1. Check that $(1 + 4\pi^2|y|^2)^{-1}g$ is in the Schwartz space $\mathcal{S}$ if $g$ is in $\mathcal{S}$, so that
$(1 - \Delta)u = f$ is solvable in $\mathcal{S}$ if $f$ is in $\mathcal{S}$.

2. Show that the Schwartz space S is closed under pointwise product and convolu- 
tion, and show that these operations are continuous from S x S into S. 


3. If Q is the open unit disk in R7, prove the following: 

(a) The function (x, y) > log (Ge + yy) is in Li (Q) for 1 < p < 2 butis 
not in L7(Q), 
(b) The unbounded function (x, y) > log log ((x? + y*)) is in Li(Q). 

4. Let $\Omega$ be a nonempty bounded open set in $\mathbb{R}^N$, and suppose that there exists a
real-valued $C^1$ function $h$ on $\mathbb{R}^N$ such that $h$ is positive on $\Omega$, $h$ is negative on
$(\Omega^{\mathrm{cl}})^c$, and the first partial derivatives of $h$ do not simultaneously vanish at any
point of the boundary $\Omega^{\mathrm{cl}} - \Omega$. Prove that $\Omega$ satisfies the cone condition of
Section 2.

Problems 5–7 compute explicitly the Fourier transforms of the members of a family
of tempered distributions.

5. Show that the function $|x|^{-(N-\alpha)}$ on $\mathbb{R}^N$ is a tempered distribution if $0 < \alpha < N$.
For what values of $\alpha$ is it the sum of an $L^1$ function and an $L^2$ function?

6. Verify the identity $\int_0^\infty t^{\beta-1}e^{-\pi t|x|^2}\,dt = \int_0^\infty t^{-\beta-1}e^{-\pi|x|^2/t}\,dt = \Gamma(\beta)\,(\pi|x|^2)^{-\beta}$.

7. Let $\varphi$ be in $\mathcal{S}(\mathbb{R}^N)$. Taking the formula $\mathcal{F}(e^{-\pi t|x|^2}) = t^{-N/2}e^{-\pi|x|^2/t}$ as known
and applying the multiplication formula, obtain the identity

$$
\int_{\mathbb{R}^N} e^{-\pi t|x|^2}\,\hat{\varphi}(x)\,dx = t^{-N/2}\int_{\mathbb{R}^N} e^{-\pi|x|^2/t}\,\varphi(x)\,dx.
$$

Multiply both sides by $t^{\frac12(N-\alpha)-1}$ and integrate in $t$. Dropping $dx$ from the
notation for tempered distributions that are given by functions, conclude from
the resulting formula that
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a7 3N+er(5(N — a) 
T(5a) 


Ps 


F(lx|) = 


as tempered distributions if0 <a < N. 


Problems 8–12 introduce a family $H^s = H^s(\mathbb{R}^N)$ of Hilbert spaces for $s$ real. This
is another family of spaces called Sobolev spaces. The space $H^s$ consists of all
tempered distributions $T \in \mathcal{S}'(\mathbb{R}^N)$ whose Fourier transforms $\mathcal{F}(T)$ are locally square
integrable functions such that $\int_{\mathbb{R}^N} |\mathcal{F}(T)|^2(1 + |\xi|^2)^s\,d\xi$ is finite, the norm $\|T\|_{H^s}$
being the square root of this expression. The spaces $H^s$ get larger as $s$ decreases.

8. Let $s \ge 0$ be an integer, and let $T$ be a tempered distribution.
(a) Prove that if $T$ is in $H^s$, then all distributions $D^\alpha T$ with $|\alpha| \le s$ are $L^2$
functions. In this situation, if $T$ is the $L^2$ function $f$, conclude that $f$ is in
$L^2_s(\mathbb{R}^N)$.
(b) Prove conversely that if $D^\alpha T$ is given by an $L^2$ function whenever $|\alpha| \le s$,
then $T$ is in $H^s$.
(c) As a consequence of (a) and (b), $H^s$ can be identified with $L^2_s(\mathbb{R}^N)$ if $s \ge 0$
is an integer. Prove that the respective norms are bounded above and below
by constant multiples of each other.

9. (a) Prove for each $s$ that the operator $A_s(T) = \mathcal{F}^{-1}\big((1 + |\xi|^2)^{s/2}\mathcal{F}(T)\big)$ is a
linear isometry of $H^s$ onto $H^0 = L^2$, and conclude that the inner-product
space $H^s$ is a Hilbert space.
(b) Prove that $A_s$ carries the subspace $\mathcal{S}(\mathbb{R}^N)$ of Schwartz functions, i.e.,
tempered distributions of the form $T_\varphi$ with $\varphi \in \mathcal{S}(\mathbb{R}^N)$, onto itself.
(c) Prove that $\mathcal{S}(\mathbb{R}^N)$ is dense in $H^s$ for all $s$.

10. Suppose that $T$ is in $H^{-s}$ and $\varphi$ is in $\mathcal{S}(\mathbb{R}^N) \subseteq H^s$. Prove that $|\langle T, \varphi\rangle| \le
\|T\|_{H^{-s}}\,\|\varphi\|_{H^s}$.

11. Conversely suppose that $s$ is real and that $T$ is a tempered distribution such that
$|\langle T, \varphi\rangle| \le C\|\varphi\|_{H^s}$ for all $\varphi \in \mathcal{S}(\mathbb{R}^N)$. Show that $\mathcal{F}(T)$ defines a bounded linear
functional on the Hilbert space $L^2((1 + |\xi|^2)^{s}\,d\xi)$, and deduce that $T$ is in $H^{-s}$
with $\|T\|_{H^{-s}} \le C$.

12. Let $s > N/2$.
(a) Prove that if the tempered distribution $T_\varphi$ given by the function $\varphi \in \mathcal{S}(\mathbb{R}^N)$
is regarded as a member of $H^s$, then $\|\varphi\|_{\sup} \le \|\mathcal{F}\varphi\|_1 \le C\|T_\varphi\|_{H^s}$,
where $C$ is the constant $\big(\int_{\mathbb{R}^N}(1 + |\xi|^2)^{-s}\,d\xi\big)^{1/2}$ independent of $\varphi$.
(b) (Sobolev’s Theorem) Deduce from (a) that any member $T$ of $H^s$ with
$s > N/2$ is given by a bounded continuous function.


Problems 13–20 concern the Hardy spaces $H^p(\mathbb{R}^2_+)$ for the upper half plane $\mathbb{R}^2_+ =
\{z \in \mathbb{C} \mid \operatorname{Im} z > 0\}$. These problems use complex analysis in one variable, and some
familiarity with the Poisson and conjugate Poisson kernels as in Chapters VII and IX
of Basic will be helpful. The space $H^p(\mathbb{R}^2_+)$ is defined to be the vector subspace of
analytic functions in the space $\mathcal{H}^p(\mathbb{R}^2_+)$. Let $f^*$ be the Hardy–Littlewood maximal
function of $f$ on $\mathbb{R}^1$. Take as known the result from Basic that the Poisson integral
$P_y * f$ satisfies $|P_y * f(x)| \le Cf^*(x)$ with $C$ independent of $f$ and $y$.

13. Suppose that $p$ satisfies $1 < p < \infty$, and let $H : L^p(\mathbb{R}^1) \to L^p(\mathbb{R}^1)$ be the
Hilbert transform.
(a) Prove that if $u_0(x)$ is in $L^p(\mathbb{R}^1)$, then the Poisson integral of the function
$u_0(x) + i(Hu_0)(x)$ is in $H^p(\mathbb{R}^2_+)$.
(b) Conversely suppose that $f(x + iy)$ is in $H^p(\mathbb{R}^2_+)$. Applying Theorem 3.25,
let $f(x + iy)$ be the Poisson integral of the member $f_0(x)$ of $L^p(\mathbb{R}^1)$. If
$\operatorname{Re} f_0 = u_0$, prove that $\operatorname{Im} f_0 = Hu_0$.

14. Prove that the functions $f$ in $L^2(\mathbb{R}^1)$ whose Poisson integrals are in the subspace
$H^2(\mathbb{R}^2_+)$ of $\mathcal{H}^2(\mathbb{R}^2_+)$ are exactly the functions for which $\mathcal{F}f(x) = 0$ a.e. for
$x < 0$.

15. Let $F = (f_1, \dots, f_n)$ be an $n$-tuple of analytic functions on an open subset of
$\mathbb{C}$, and let $(\,\cdot\,,\,\cdot\,)$ be the usual inner product on $\mathbb{C}^n$. For a function on an open set
in $\mathbb{C}$, define $f_z = \frac12(f_x - if_y)$ and $f_{\bar z} = \frac12(f_x + if_y)$, so that the condition for
analyticity is $f_{\bar z} = 0$ and so that $\Delta f = 4f_{z\bar z}$. Suppose that $F$ is nowhere 0 on an
open set. Prove for all $q > 0$ that

$$
\begin{aligned}
\Delta(|F|^q) &= q^2|F|^{q-4}\,|(F, F')|^2 + 2q|F|^{q-4}\big(-|(F, F')|^2 + |F|^2|F'|^2\big)\\
&\ge q^2|F|^{q-4}\,|(F, F')|^2 \ge 0.
\end{aligned}
$$

16. Suppose that $u$ is a smooth real-valued function on an open set in $\mathbb{R}^N$ containing
the ball $B(r; x_0)^{\mathrm{cl}}$ such that $\Delta u \ge 0$ on $B(r; x_0)$ and $u \le 0$ on $\partial B(r; x_0)$. By
considering $u + c(|x - x_0|^2 - r^2)$ for a suitable $c$, prove that $u \le 0$ on $B(r; x_0)$.

17. Let $f$ be in $H^1(\mathbb{R}^2_+)$, and define $F_\varepsilon : \{\operatorname{Im} z > 0\} \to \mathbb{C}^2$ for $\varepsilon > 0$ by $F_\varepsilon(z) =
(f(z + i\varepsilon),\ \varepsilon(z + i)^{-2})$. Define $g_\varepsilon(x) = |F_\varepsilon(x)|^{1/2}$ for $x \in \mathbb{R}$.
(a) Prove that $\|g_\varepsilon\|_2^2 \le \|f\|_{\mathcal{H}^1} + \varepsilon\|(x + i)^{-2}\|_1$.
(b) Let $g_\varepsilon(z)$ be the Poisson integral of $g_\varepsilon(x)$. Show that $|F_\varepsilon(z)|^{1/2}$ and $g_\varepsilon(z)$
both tend to 0 as $|x|$ or $y$ tends to infinity in $\mathbb{R}^2_+$.
(c) By applying the previous two problems to $|F_\varepsilon(z)|^{1/2} - g_\varepsilon(z)$ on large disks
in $\mathbb{R}^2_+$, prove that $|F_\varepsilon(z)|^{1/2} \le g_\varepsilon(z)$ on $\mathbb{R}^2_+$.

18. By Alaoglu’s Theorem let $g(x)$ be a weak-star limit in $L^2(\mathbb{R}^1)$ of a sequence
$g_{\varepsilon_n}(x)$ with $\varepsilon_n \downarrow 0$, and let $g(z)$ be the Poisson integral of $g(x)$.
(a) Prove that $|f(z)|^{1/2} \le g(z) \le Cg^*(x)$, with $g^*(x)$ being the Hardy–
Littlewood maximal function of $g(x)$.
(b) Conclude that $|f(x + iy)|$ is dominated by the fixed integrable function
$g^*(x)^2$ as $y \downarrow 0$.

19. Let $X$ be a locally compact separable metric space, let $\mu$ be a finite Borel
measure on $X$, and suppose that $\{g_n\}$ is a sequence of Borel functions on $X$
with $|g_n| \le 1$ such that the sequence $\{g_n(x)\,d\mu(x)\}$ of complex Borel measures
converges weak-star against $C_{\mathrm{com}}(X)$ to a complex Borel measure $\nu$. Prove that
$\nu$ is absolutely continuous with respect to $\mu$.


20. (F. and M. Riesz Theorem) Deduce from the above facts that each member
of $H^1(\mathbb{R}^2_+)$ is the Poisson integral of an $L^1$ function on $\mathbb{R}^1$.


Problems 21–24 show that the limit $Tf = \lim_{\varepsilon\downarrow 0} T_\varepsilon f$ defining a Calderón–Zygmund
operator $T$ exists almost everywhere for $f \in L^p$ and $1 < p < \infty$, as well as in
$L^p$. Let notation be as in the statement of Theorem 3.26 and Lemma 3.29: $K(x)$
is a $C^1$ function on $\mathbb{R}^N - \{0\}$ homogeneous of degree 0 with mean value 0 over the
unit sphere, $k(x)$ is $K(x)/|x|^N$ for $|x| \ge 1$ and is 0 for $|x| < 1$. For any function
$\varphi$ on $\mathbb{R}^N$, define $\varphi_\varepsilon(x) = \varepsilon^{-N}\varphi(\varepsilon^{-1}x)$. The operator $T_\varepsilon f$ is $k_\varepsilon * f$. Let $f^*$ be the
Hardy–Littlewood maximal function of $f$. Take as known from Basic that if $\Psi \ge 0$
is an integrable function on $\mathbb{R}^N$ of the form $\Psi(x) = \Psi_0(|x|)$ with $\Psi_0$ nonincreasing
and finite at 0, then $\sup_{\varepsilon>0}(\Psi_\varepsilon * f)(x) \le C_\Psi f^*(x)$ for some finite constant $C_\Psi$. Let
$f$ be in $L^p$ with $1 < p < \infty$.

21. Let $\varphi$ be as in Proposition 3.5e. Define $\Phi = T(\varphi) - k$.
(a) Taking into account the fact that $\varphi$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, prove that $T(\varphi)$ is in
$C^\infty(\mathbb{R}^N)$, and conclude that $\Phi$ is locally bounded.
(b) By taking into account the compact support of $\varphi$, prove that $|\Phi(x)|$ is
bounded by a multiple of $|x|^{-N-1}$ for large $|x|$.
(c) Deduce that $|\Phi(x)|$ is dominated for all $x$ by an integrable function $\Psi(x)$ on
$\mathbb{R}^N$ of the form $\Psi(x) = \Psi_0(|x|)$ with $\Psi_0$ nonincreasing and finite at 0.

22. Let $\varphi$ and $\Phi$ be as in the previous problem.
(a) Prove that $(T\varphi)_\varepsilon = T\varphi_\varepsilon$.
(b) Prove the associativity formula $T\varphi_\varepsilon * f = \varphi_\varepsilon * (Tf)$.
(c) Deduce that $\varphi_\varepsilon * (Tf) - k_\varepsilon * f = \Phi_\varepsilon * f$.

23. Conclude from the previous problem that there are constants $C_1$ and $C_2$ independent
of $f$ such that $\sup_{\varepsilon>0} |T_\varepsilon f(x)| \le C_1 f^*(x) + C_2 (Tf)^*(x)$.

24. Why does it follow that $\lim_{\varepsilon\downarrow 0} T_\varepsilon f(x)$ exists almost everywhere?


Problems 25–34 introduce Sobolev spaces in the context of multiple Fourier series. In
this set of problems, periodic functions are understood to be defined on $\mathbb{R}^N$ and to be
periodic of period $2\pi$ in each variable. Write $T$ for the circle $\mathbb{R}/2\pi\mathbb{Z}$, and let $C^\infty(T^N)$
be the complex vector space of all smooth periodic functions. Let $L^2(T^N)$ be the
space of all periodic functions (modulo functions that are 0 almost everywhere) that
are in $L^2([-\pi,\pi]^N)$. If $\alpha = (\alpha_1, \dots, \alpha_N)$ is a multi-index, a member $f$ of $L^2(T^N)$
is said to have a weak $\alpha$th derivative in $L^2(T^N)$ if there exists a function $D^\alpha f$ in
$L^2(T^N)$ with

$$
\int_{[-\pi,\pi]^N} (D^\alpha f)\,\varphi\,dx = (-1)^{|\alpha|}\int_{[-\pi,\pi]^N} f\,D^\alpha\varphi\,dx
$$

for all $\varphi$ in $C^\infty(T^N)$. Define the Sobolev space $L^2_k(T^N)$ for each integer $k \ge 0$ to
consist of all members of $L^2(T^N)$ having a weak $\alpha$th derivative in $L^2(T^N)$ for all $\alpha$ with
$|\alpha| \le k$. The norm on $L^2_k(T^N)$ is given by

$$
\|f\|^2_{L^2_k(T^N)} = \sum_{|\alpha|\le k} (2\pi)^{-N}\int_{[-\pi,\pi]^N} |D^\alpha f|^2\,dx.
$$

25. Prove that $L^2_k(T^N)$ is complete.

26. Prove that $C^\infty(T^N)$ is dense in $L^2_k(T^N)$ for all $k \ge 0$.

27. Prove for each multi-index $\alpha$ and each $k \ge 0$ that there exists a constant $C_{\alpha,k}$
such that

$$
\|D^\alpha f\|_{L^2_k(T^N)} \le C_{\alpha,k}\,\|f\|_{L^2_{k+|\alpha|}(T^N)}
$$

for all $f$ in $C^\infty(T^N)$.

28. Prove for each $k \ge 0$ that there is a constant $A_k$ such that every member $f$ of
$C^\infty(T^N)$ has

$$
\|f\|_{L^2_k(T^N)} \le A_k\sum_{|\alpha|\le k}\ \sup_{x\in[-\pi,\pi]^N} |D^\alpha f(x)|.
$$

29. Prove for each integer $k \ge 0$ that there exist positive constants $B_k$ and $C_k$ such
that $B_k\sum_{|\alpha|\le k} l^{2\alpha} \le (1 + |l|^2)^k \le C_k\sum_{|\alpha|\le k} l^{2\alpha}$.

30. Prove that if $f$ is periodic and locally integrable on $\mathbb{R}^N$ with multiple Fourier
series $f(x) \sim \sum_{l\in\mathbb{Z}^N} c_l e^{il\cdot x}$, then $f$ is in $L^2_k(T^N)$ if and only if

$$
\sum_{l\in\mathbb{Z}^N} |c_l|^2(1 + |l|^2)^k < \infty.
$$

31. With notation as in the previous problem, prove for each $k \ge 0$ that there exist
positive constants $B_k$ and $C_k$ independent of $f$ such that

$$
B_k\|f\|^2_{L^2_k(T^N)} \le \sum_{l\in\mathbb{Z}^N} |c_l|^2(1 + |l|^2)^k \le C_k\|f\|^2_{L^2_k(T^N)}
$$

for all $f$ in $L^2_k(T^N)$.

32. (Sobolev’s Theorem) Suppose that $K$ is an integer with $K > N/2$. Prove that
$\sum_{l\in\mathbb{Z}^N} (1 + |l|^2)^{-K} < \infty$, and deduce that any $f$ in $L^2_K(T^N)$ can be adjusted on
a set of measure 0 so as to be continuous.

33. Prove for each multi-index $\alpha$ that there exist some integer $m(\alpha)$ and constant $C_\alpha$
such that

$$
\sup_{x\in[-\pi,\pi]^N} |D^\alpha f(x)| \le C_\alpha\,\|f\|_{L^2_{m(\alpha)}(T^N)}
$$

for all $f$ in $C^\infty(T^N)$.

34. Prove that the separating family of seminorms $\|\cdot\|_{L^2_k(T^N)}$ on $C^\infty(T^N)$, indexed
by $k$, is equivalent to the family of seminorms $\sup_{x\in[-\pi,\pi]^N} |D^\alpha(\,\cdot\,)(x)|$, indexed
by $\alpha$. Here “is equivalent to” is to mean that the identity map is uniformly
continuous from the one metric space to the other.


CHAPTER IV 


Topics in Functional Analysis 


Abstract. This chapter pursues three lines of investigation in the subject of functional analysis —one 
involving smooth functions and distributions, one involving fixed-point theorems, and one involving 
spectral theory. 

Section 1 introduces topological vector spaces. These are real or complex vector spaces with a 
Hausdorff topology in which addition and scalar multiplication are continuous. Examples include 
normed linear spaces, spaces given by a separating family of countably many seminorms, and weak 
and weak-star topologies in the context of Banach spaces. Various general properties of topological 
vector spaces are proved, and it is proved that the quotient of a topological vector space by a closed 
vector subspace is Hausdorff and is therefore a topological vector space. 

Section 2 introduces a topology on the space $C^\infty(U)$ of smooth functions on an open subset of
$\mathbb{R}^N$. The support of a continuous linear functional on $C^\infty(U)$ is defined and shown to be a compact
subset of $U$. Accordingly, the continuous linear functionals are called distributions of compact
support.

Section 3 studies weak and weak-star topologies in more detail. The main result is Alaoglu’s 
Theorem, which says that the closed unit ball in the weak-star topology on the dual of a normed linear 
space is compact. In an earlier chapter a preliminary form of this theorem was used to construct 
elements in a dual space as limits of weak-star convergent subsequences. 

Section 4 follows Alaoglu’s Theorem along a particular path, giving what amounts to a first 
example of the Gelfand theory of Banach algebras. The relevant theorem, known as the Stone 
Representation Theorem, says that conjugate-closed uniformly closed subalgebras containing the 
constants in B(S) are isomorphic via a norm-preserving algebra isomorphism to the space of all 
continuous functions on some compact Hausdorff space. The compact space in question is the space 
of multiplicative linear functionals on the subalgebra, and the proof of compactness uses Alaoglu’s 
Theorem. 

Sections 5—6 return to the lines of study toward distributions and fixed-point theorems. Section 5 
studies the relationship between convexity and the existence of separating linear functionals. The 
main theorem makes use of the Hahn—Banach Theorem. Section 6 introduces locally convex 
topological vector spaces. Application of the basic separation theorem from the previous section 
shows the existence of many continuous linear functionals on such a space. 

Section 7 specializes to the line of study via smooth functions and distributions. The topic is
the introduction of a certain locally convex topology on the space $C^\infty_{\mathrm{com}}(U)$ of smooth functions of
compact support on $U$. This is best characterized by a universal mapping property introduced in the
section.

Sections 8–9 pursue locally convex spaces along the other line of study that split off in Section 5.
Section 8 gives the Krein–Milman Theorem, which asserts the existence of a supply of extreme
points for any nonempty compact convex set in a locally convex topological vector space. Section 9
relates compact convex sets to the subject of fixed-point theorems.

Section 10 takes up the abstract theory of Banach algebras, with particular attention to commutative
$C^*$ algebras with identity. Three examples are the algebras characterized by the Stone
Representation Theorem, any $L^\infty$ space, and any adjoint-closed commutative Banach algebra
consisting of bounded linear operators on a Hilbert space and containing the identity.

Section 11 continues the investigation of the last of the examples in the previous section and 
derives the Spectral Theorem for bounded self-adjoint operators and certain related families of 
operators. Powerful applications follow from a functional calculus implied by the Spectral Theorem. 
The section concludes with remarks about the Spectral Theorem for unbounded self-adjoint operators. 


1. Topological Vector Spaces 


In this section we shall work with vector spaces over $\mathbb{R}$ or $\mathbb{C}$, and the distinction
between the two fields will not be very important. We write $\mathbb{F}$ for this field of
scalars. A topological vector space or linear topological space is a vector space
$X$ over $\mathbb{F}$ with a Hausdorff topology such that addition, as a mapping $X \times X \to X$,
and scalar multiplication, as a mapping $\mathbb{F} \times X \to X$, are continuous. The
mappings that we study between topological vector spaces are the continuous
linear functions, which may be referred to as “continuous linear operators.” An
isomorphism of topological vector spaces over $\mathbb{F}$ is a continuous linear operator
with a continuous inverse.

The simplest examples of topological vector spaces are the spaces $\mathbb{F}^N$ of
column vectors with the usual metric topology. Since the topologies of $\mathbb{F}^N$,
$\mathbb{F}^N \times \mathbb{F}^N$, and $\mathbb{F} \times \mathbb{F}^N$ are given by metrics, continuity of functions defined on
any of these spaces may be tested by sequences. In particular, continuity of the
vector-space operations on $\mathbb{F}^N$ reduces to the familiar results about limits of sums
of vectors and limits of scalars times vectors. Moreover, if $L : \mathbb{F}^N \to Y$ is
any linear function from $\mathbb{F}^N$ into a topological vector space $Y$ over $\mathbb{F}$, then $L$ is
continuous. To see this, let $\{e_1, \dots, e_N\}$ be the standard basis of column vectors,
and let $(\,\cdot\,,\,\cdot\,)$ be the standard inner product on $\mathbb{F}^N$, namely the dot product if
$\mathbb{F} = \mathbb{R}$ and the usual Hermitian inner product if $\mathbb{F} = \mathbb{C}$. Write $y_j = L(e_j)$. For
any $x$ in $\mathbb{F}^N$, we have

\[
L(x) = \sum_{j=1}^{N} (x, e_j)\,L(e_j) = \sum_{j=1}^{N} (x, e_j)\,y_j .
\]

If $\{x_n\}$ is a sequence converging to $x$ in $\mathbb{F}^N$, then the continuity of the inner product
forces $(x_n, e_j) \to (x, e_j)$ for each $j$. Then $L(x_n)$ tends to $L(x)$ in $Y$ since the
vector space operations are continuous in $Y$. Hence $L$ is continuous.

A second class of examples is the class of normed linear spaces. These were
defined in Basic, and the continuity of the operations was established there.¹
The spaces $\mathbb{F}^N$ of column vectors are examples. Further examples include the
space $B(S)$ of all bounded scalar-valued functions on a nonempty set $S$ with the
supremum norm, the vector subspace $C(S)$ of continuous members of $B(S)$ when
$S$ is a topological space, the vector subspaces $C_{\mathrm{com}}(S)$ and $C_0(S)$ of continuous
functions of compact support and of continuous functions vanishing at infinity
when $S$ is locally compact Hausdorff, the space $L^p(X, \mu)$ for $1 \le p \le \infty$ when
$(X, \mu)$ is a measure space, and the space $M(S)$ of finite regular Borel complex
measures on a locally compact Hausdorff space with the total variation norm.

A wider class of examples, which includes the normed linear spaces, is the class
of topological vector spaces defined by seminorms. Seminorms were defined in
Section III.1. If we have a family $\{\|\cdot\|_s\}$ of seminorms on a vector space $X$ over
$\mathbb{F}$, with indexing given by $s$ in some nonempty set $S$, the corresponding topology
on $X$ is defined as the weak topology determined by all functions $x \mapsto \|x - y\|_s$
for $s \in S$ and $y \in X$. A base for the open sets of $X$ is obtained as follows: For
each triple $(y, s, r)$, with $y$ in $X$, with $s$ one of the seminorm indices, and with
$r > 0$, the set $\{x \mid \|x - y\|_s < r\}$ is to be in the base, and the base consists of all
finite intersections of these sets as $(y, s, r)$ varies.

In order to obtain a topological vector space from a system of seminorms, we
must ensure the Hausdorff property, and we do so by insisting that the only $f$
in $X$ with $\|f\|_s = 0$ for all $s$ is $f = 0$. In this case the family of seminorms is
called a separating family. Let us go through the argument that a space defined
by a separating family of seminorms is a topological vector space.


Proposition 4.1. Let $X$ be a vector space over $\mathbb{F}$ endowed with a separating
family $\{\|\cdot\|_s\}$ of seminorms. Then the weak topology determined by all functions
$x \mapsto \|x - y\|_s$ makes $X$ into a topological vector space.


PROOF. To see that $X$ is Hausdorff, let $x_0$ and $y_0$ be distinct points of $X$. By
assumption, there exists some $s$ such that $\|x_0 - y_0\|_s$ is a positive number $r$. The
sets $\{x \mid \|x - x_0\|_s < r/2\}$ and $\{y \mid \|y - y_0\|_s < r/2\}$ are disjoint and open, and
they contain $x_0$ and $y_0$, respectively. Hence $X$ is Hausdorff.

To see that addition is continuous, we are to show that if a net $\{(x_\alpha, y_\alpha)\}$ is
convergent in $X \times X$ to $(x_0, y_0)$, then $\{x_\alpha + y_\alpha\}$ converges to $x_0 + y_0$. This means that
if $\|x_\alpha - x_0\|_s + \|y_\alpha - y_0\|_s$ tends to 0 for each $s$, then $\|(x_\alpha + y_\alpha) - (x_0 + y_0)\|_s$ tends
to 0 for each $s$. This is immediate from the triangle inequality for the seminorm
$\|\cdot\|_s$, and hence addition is continuous. The proof that scalar multiplication is
continuous is similar.
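
The "similar" argument for scalar multiplication can be sketched as follows. This is only an illustration under the assumption that $\{(c_\alpha, x_\alpha)\}$ is a net converging to $(c_0, x_0)$ in $\mathbb{F} \times X$; the names $c_\alpha$ and $c_0$ are introduced here and do not appear in the proof above.

```latex
\begin{align*}
\|c_\alpha x_\alpha - c_0 x_0\|_s
  &= \|(c_\alpha - c_0)\,x_\alpha + c_0\,(x_\alpha - x_0)\|_s \\
  &\le |c_\alpha - c_0|\,\|x_\alpha\|_s + |c_0|\,\|x_\alpha - x_0\|_s \\
  &\le |c_\alpha - c_0|\,\bigl(\|x_\alpha - x_0\|_s + \|x_0\|_s\bigr)
       + |c_0|\,\|x_\alpha - x_0\|_s .
\end{align*}
```

For each fixed $s$, both $|c_\alpha - c_0|$ and $\|x_\alpha - x_0\|_s$ tend to 0, so every term on the right side tends to 0.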


¹The definition appears in Section V.9 of Basic, and the continuity of the operations is proved in
Proposition 5.55.

We have encountered two distinctly different kinds of examples of topological
vector spaces defined by families of seminorms. In the first kind a countable
family of seminorms suffices to define the topology. Normed linear spaces are
examples. So is the Schwartz space $\mathcal{S}(\mathbb{R}^N)$, consisting of all smooth scalar-valued
functions on $\mathbb{R}^N$ such that the product of any polynomial with any iterated partial
derivative of the function is bounded. The defining seminorms for the Schwartz
space are

\[
\|f\|_{P,Q} = \sup_{x \in \mathbb{R}^N} |P(x)\,(Q(D)f)(x)| ,
\]

where $P$ and $Q$ are arbitrary polynomials. We saw in Section III.1 that the same
topology arises if we use only the countably many seminorms for which $P$ is
some monomial $x^\alpha$ and $Q$ is some monomial $x^\beta$. This family of seminorms is a
separating family because if $\|f\|_{1,1} = 0$, then $f = 0$.

Another example of a topological vector space whose topology can be defined
by countably many seminorms is the space $C^\infty(U)$ of smooth scalar-valued
functions on a nonempty open set $U$ of $\mathbb{R}^N$ with the topology of uniform convergence
on compact sets of all derivatives. The family of seminorms is indexed by pairs
$(K, P)$ with $K$ a compact subset of $U$ and with $P$ a polynomial, the corresponding
seminorm being $\|f\|_{K,P} = \sup_{x \in K} |(P(D)f)(x)|$. The Hausdorff condition is
satisfied because if $\|f\|_{K,1} = 0$ for all $K$, then $f = 0$. We shall see in the
next section that the topology can be defined by a countable subfamily of these
seminorms.

Still a third space of smooth scalar-valued functions, besides $\mathcal{S}(\mathbb{R}^N)$ and
$C^\infty(U)$, will be of interest to us. This is the space $C^\infty_{\mathrm{com}}(U)$ of smooth functions on
a nonempty open $U$ with compact support contained in $U$. The useful topology
on this space is more complicated than the topologies considered so far. In
particular, it cannot be given by countably many seminorms. Describing the
topology requires some preparation, and we come back to the details in Section 7.

The examples we have encountered of topological vector spaces defined by 
an uncountable family of seminorms, but not definable by a countable family, 
are qualitatively different from the examples above. Indeed, they lead along a 
different theoretical path, as we shall see—one that takes us in the direction of 
spectral theory rather than distribution theory. 

The first class of such examples is the class of normed linear spaces $X$ with
the “weak topology,” as contrasted with the norm topology. Let $X^*$ be the set
of linear functionals of $X$ that are continuous in the norm topology. The weak
topology on $X$ was defined in Chapter X of Basic as the weakest topology that
makes all members of $X^*$ continuous. Of course, any set that is open in the weak
topology on $X$ is open in the norm topology. A base for the open sets in the weak
topology on $X$ is obtained as follows: For each triple $(x_0, x^*, r)$, with $x_0$ in $X$, $x^*$
in $X^*$, and $r > 0$, the set $\{x \mid |x^*(x - x_0)| < r\}$ is to be in the base, and the base
consists of all finite intersections of these sets as $(x_0, x^*, r)$ varies. The weak
topology is given by the family of seminorms $\|\cdot\|_{x^*} = |x^*(\,\cdot\,)|$. The proof that
the weak topology is Hausdorff requires the fact, for each $x \ne 0$ in $X$, that there
is some member $x^*$ of $X^*$ with $x^*(x) \ne 0$; this fact is one of the standard corollaries of
the Hahn–Banach Theorem. Examples of weak topologies will be discussed in
Section 3.

Similarly the weak-star topology on $X^*$, when $X$ is a normed linear space,
was defined in Basic as the weakest topology on $X^*$ that makes all members of
$X$ continuous. This is given by the family of seminorms $\|\cdot\|_x = |\,\cdot\,(x)|$. Here
the relevant fact for seeing that the topology is Hausdorff is that for each $x^* \ne 0$
in $X^*$, there is some $x$ in $X$ with $x^*(x) \ne 0$. This is just a matter of the definition
of $x^* \ne 0$ and depends on no theorem. Examples of weak-star topologies will be
discussed in Section 3.

The above classes of examples by no means exhaust the possibilities for
topological vector spaces. Let us mention briefly one example that is not even close
to being definable by seminorms. It is the space $L^p([0, 1])$ with $0 < p < 1$. This
is the vector space of all real-valued Borel functions $f$ on $[0, 1]$ with $\int_{[0,1]} |f|^p\,dx$
finite, except that we identify two functions if they differ only on a set of measure 0.
Let us see that $d(f, g) = \int_{[0,1]} |f - g|^p\,dx$ is a metric. We need only verify the
triangle inequality in the form $\int_{[0,1]} |f + g|^p\,dx \le \int_{[0,1]} |f|^p\,dx + \int_{[0,1]} |g|^p\,dx$.
To check this, we observe for nonnegative $r$ that $(1 + r)^p - (1 + r^p)$ is 0 at $r = 0$
and has negative derivative $p\bigl((1 + r)^{p-1} - r^{p-1}\bigr)$ for $r > 0$ since $p - 1$ is negative. Thus
$(1 + r)^p \le 1 + r^p$ for $r \ge 0$, and consequently $|a + b|^p \le (|a| + |b|)^p \le |a|^p + |b|^p$
for all real $a$ and $b$. Taking $a = f(x)$ and $b = g(x)$ and integrating, we obtain the
desired triangle inequality. One readily shows that $L^p([0, 1])$ with this metric is a
topological vector space. On the other hand, this topological vector space is rather
pathological, as is shown in Problem 8 at the end of the chapter. For example, it
has no nonzero continuous linear functionals, whereas nonzero topological vector
spaces whose topologies are given by seminorms always have enough continuous
linear functionals to separate points.²
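
The passage from $(1 + r)^p \le 1 + r^p$ to the inequality for $a$ and $b$ is a rescaling, which can be written out as below. The substitution $r = |b|/|a|$ is the assumption made here for $a \ne 0$; the case $a = 0$ is trivial.

```latex
\begin{align*}
(|a| + |b|)^p &= |a|^p\,(1 + r)^p, \qquad r = |b|/|a|,\ a \ne 0, \\
              &\le |a|^p\,(1 + r^p) \\
              &= |a|^p + |b|^p .
\end{align*}
% Numerical check with p = 1/2 and a = b = 1:
% (1 + 1)^{1/2} = \sqrt{2} \approx 1.414 \le 1 + 1 = 2 .
```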

Now we turn our attention to a few results valid for arbitrary topological vector 
spaces. 


Proposition 4.2. In any topological vector space, the closure of any vector 
subspace is a vector subspace. 


PROOF. Let $V$ be a vector subspace of the topological vector space $X$. If $x$ and
$y$ are in $V^{\mathrm{cl}}$, then $(x, y)$ is in $V^{\mathrm{cl}} \times V^{\mathrm{cl}} = (V \times V)^{\mathrm{cl}}$. Any continuous function
$f$ has the property for any set $S$ that $f(S^{\mathrm{cl}}) \subseteq f(S)^{\mathrm{cl}}$. Applying this fact to the
addition function, we see that $x + y$ is in $V^{\mathrm{cl}}$ since $V$ is the image of $V \times V$ under
addition. Thus $V^{\mathrm{cl}}$ is closed under addition. Similarly $V^{\mathrm{cl}}$ is closed under scalar
multiplication.


²More precisely it will be observed in Section 6 that topological vector spaces whose topologies
are given by seminorms are “locally convex,” and it will be proved in that same section that locally
convex spaces always have enough continuous linear functionals to separate points.


Lemma 4.3. If X is a real or complex vector space in which addition and 
scalar multiplication are continuous and if {0} is a closed subset of X, then X is 
Hausdorff and hence is a topological vector space. 


PROOF. Since translations are homeomorphisms, it is enough to separate 0 and
an arbitrary $x \ne 0$ by disjoint open neighborhoods. Since $X - \{0\}$ is open, so
is $V = X - \{x\}$. By continuity of subtraction, choose an open neighborhood $U$
of 0 such that the set of differences satisfies $U - U \subseteq V$. Then $U$ and $x + U$
are open neighborhoods of 0 and $x$. If $y$ is in their intersection, then $y$ is in $U$,
and $y$ is of the form $x + u$ for some $u$ in $U$. Hence $x = y - u$ exhibits $x$ as in
$U - U \subseteq V = X - \{x\}$, contradiction. Thus we can take $U$ and $x + U$ as the
required disjoint open neighborhoods of 0 and $x$.
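
The choice of $U$ in this proof can be made explicit as follows. This is a sketch only; the names $s$, $U_1$, and $U_2$ are introduced for illustration and are not part of the proof above.

```latex
\begin{gather*}
s : X \times X \to X, \qquad s(a, b) = a + (-1)b, \qquad s(0, 0) = 0 \in V, \\
U_1 \times U_2 \subseteq s^{-1}(V)
  \ \text{ for some open neighborhoods } U_1, U_2 \text{ of } 0, \\
U = U_1 \cap U_2 \quad\Longrightarrow\quad
  U - U \subseteq U_1 - U_2 = s(U_1 \times U_2) \subseteq V .
\end{gather*}
```

The map $s$ is continuous because it is built from scalar multiplication by $-1$ and addition, and the product sets $U_1 \times U_2$ form a base for the topology of $X \times X$.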


Proposition 4.4. If $X$ is a topological vector space, if $Y$ is a closed vector
subspace, and if the quotient vector space $X/Y$ is given the quotient topology,³
then $X/Y$ is a topological vector space, and the quotient map $q : X \to X/Y$
carries open sets to open sets.

PROOF. If $U$ is open in $X$, then $q^{-1}(q(U)) = \bigcup_{y \in Y} (y + U)$ exhibits
$q^{-1}(q(U))$ as the union of open sets and hence as an open set. By definition
of the topology on $X/Y$, $q(U)$ is open in $X/Y$. Hence $q$ carries open sets in $X$
to open sets in $X/Y$.

To see that addition is continuous in $X/Y$, let $x_1$ and $x_2$ be in $X$, and let $E$ be
an open neighborhood of the member $x_1 + x_2 + Y$ of $X/Y$. Then $q^{-1}(E)$ is an
open neighborhood of $x_1 + x_2$ in $X$. By continuity of addition in $X$, there exist
open neighborhoods $U_1$ of $x_1$ and $U_2$ of $x_2$ such that $U_1 + U_2 \subseteq q^{-1}(E)$. The
map $q$ is open and linear, and hence $q(U_1)$ and $q(U_2)$ are open subsets of $X/Y$
with $q(U_1) + q(U_2) \subseteq q(q^{-1}(E)) = E$. Thus addition is continuous in $X/Y$.

To see that scalar multiplication is continuous in $X/Y$, let $c$ be a scalar, let $x$ be
in $X$, and let $E$ be an open neighborhood of $cx + Y$ in $X/Y$. Then $q^{-1}(E)$ is an open
neighborhood of $cx$ in $X$. By continuity of scalar multiplication in $X$, there exist
open neighborhoods $A$ of $c$ in the scalars and $U$ of $x$ in $X$ such that $AU \subseteq q^{-1}(E)$.
Then $q(U)$ is an open subset of $X/Y$ such that $A\,q(U) \subseteq q(q^{-1}(E)) = E$. Hence
scalar multiplication is continuous in $X/Y$.

Since $Y$ is closed in $X$, the set $\{0\}$ is closed in $X/Y$. Applying Lemma 4.3, we
see that $X/Y$ is Hausdorff. Therefore $X/Y$ is a topological vector space.
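
The set identity used in the first paragraph of this proof can be checked elementwise. The chain of equivalences below is a sketch of that verification.

```latex
\begin{align*}
x \in q^{-1}(q(U))
  &\iff x + Y = u + Y \ \text{ for some } u \in U \\
  &\iff x - u \in Y \ \text{ for some } u \in U \\
  &\iff x \in y + U \ \text{ for some } y \in Y .
\end{align*}
```

Each set $y + U$ is open because translation by $y$ is a homeomorphism of $X$.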


³If $q : X \to X/Y$ is the quotient mapping, the open sets $E$ of $X/Y$ are defined as all subsets
such that $q^{-1}(E)$ is open in $X$.

Proposition 4.5. If $Y$ is an $n$-dimensional topological vector space over $\mathbb{F}$,
then $Y$ is isomorphic to $\mathbb{F}^n$.


PROOF. Let $y_1, \dots, y_n$ be a vector-space basis of $Y$, and let $(\,\cdot\,,\,\cdot\,)$ and $|\cdot|$
be the usual inner product and norm on $\mathbb{F}^n$. If $e_1, \dots, e_n$ is the standard basis of
$\mathbb{F}^n$, define $L\bigl(\sum_{j=1}^n c_j e_j\bigr) = \sum_{j=1}^n c_j y_j$. Then $L$ is one-one and hence is onto $Y$.
We saw earlier in this section that $L$ is continuous. We shall prove that $L^{-1}$ is
continuous, and it is enough to do so at 0 in $Y$.

Assuming on the contrary that $L^{-1}$ is not continuous at 0, we can find some
$\epsilon > 0$ such that no open neighborhood $U$ of 0 in $Y$ maps under $L^{-1}$ into the
open neighborhood $\{|x| < \epsilon\}$ of 0 in $\mathbb{F}^n$. For each such $U$, find $y_U$ in $U$ with
$|L^{-1}(y_U)| \ge \epsilon$. Define $z_U = |L^{-1}(y_U)|^{-1} y_U$. The net $\{y_U\}$ tends to 0 in $Y$ by
construction, and the numbers $|L^{-1}(y_U)|^{-1}$ are bounded by $\epsilon^{-1}$. By continuity
of scalar multiplication in $Y$, $z_U$ has limit 0 in $Y$. On the other hand, the members
of $\mathbb{F}^n$ defined by $x_U = L^{-1}(z_U) = |L^{-1}(y_U)|^{-1} L^{-1}(y_U)$ have $|x_U| = 1$ for all
$U$. The unit sphere in $\mathbb{F}^n$ is compact, and it follows that $\{x_U\}$ has a convergent
subnet, say $\{x_{U_\mu}\}$, with some limit $x_0$ such that $|x_0| = 1$. We have $L(x_U) = z_U$,
and passage to the limit gives $L(x_0) = \lim_\mu L(x_{U_\mu}) = \lim_\mu z_{U_\mu} = 0$. On the
other hand, $L$ is one-one, and hence the equality $L(x_0) = 0$ for some $x_0$ with
$|x_0| = 1$ is a contradiction. We conclude that $L^{-1}$ is continuous.


Corollary 4.6. Every finite-dimensional vector subspace of a topological 
vector space is closed. 


PROOF. Let $V$ be an $n$-dimensional subspace of a topological vector space $X$,
and suppose that $V^{\mathrm{cl}}$ properly contains $V$. Choose $x_0$ in $V^{\mathrm{cl}} - V$, and form the
vector subspace $W = V + \mathbb{F}x_0$. Then the closure of $V$ in $W$, being a vector
subspace (Proposition 4.2), is $W$. The vector subspace $W$ has dimension $n + 1$,
and Proposition 4.5 shows that $W$ is isomorphic to $\mathbb{F}^{n+1}$. All vector subspaces of
$\mathbb{F}^{n+1}$ are closed in $\mathbb{F}^{n+1}$, and hence $V$ is closed in $W$, contradiction.


Lemma 4.7. If $X$ is a topological vector space, $K$ is a compact subset of $X$,
and $V$ is an open neighborhood of 0, then there exists $\epsilon > 0$ such that $\delta K \subseteq V$
whenever $|\delta| < \epsilon$.


PROOF. For each $k \in K$, choose $\epsilon_k > 0$ and an open neighborhood $U_k$ of $k$
such that $\delta U_k \subseteq V$ whenever $|\delta| < \epsilon_k$; this is possible since scalar multiplication
is continuous at the point where the scalar is 0 and the vector is $k$. The open sets
$U_k$ cover $K$, and the compactness of $K$ implies that there is a finite subcover:
$K \subseteq U_{k_1} \cup \dots \cup U_{k_m}$. Then $\delta K \subseteq V$ whenever $|\delta| < \min_{1 \le j \le m} \epsilon_{k_j}$.
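
The choice of $\epsilon_k$ and $U_k$, together with a one-dimensional instance of the lemma, might be written out as follows. The name $m$ for the scalar multiplication map and the specific sets in the example are assumptions made for illustration.

```latex
\begin{gather*}
m : \mathbb{F} \times X \to X, \qquad m(\delta, x) = \delta x, \qquad m(0, k) = 0 \in V, \\
\{\delta : |\delta| < \epsilon_k\} \times U_k \subseteq m^{-1}(V)
  \quad\Longrightarrow\quad
  \delta U_k \subseteq V \ \text{ whenever } |\delta| < \epsilon_k . \\[4pt]
\text{Example: } X = \mathbb{R},\ K = [-3, 3],\ V = (-1, 1): \qquad
  |\delta| < \tfrac{1}{3} \ \Longrightarrow\
  \delta K = [-3|\delta|,\ 3|\delta|] \subseteq V .
\end{gather*}
```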


Proposition 4.8. Every locally compact topological vector space is finite
dimensional.

PROOF. Let $X$ be a locally compact topological vector space, let $K$ be a
compact neighborhood of 0, and let $U$ be its interior. Suppose that we have a
sequence $\{y_m\}$ in $X$ with the property that for any $\delta > 0$, there is an integer $M$
such that $m \ge M$ implies $y_m$ lies in $\delta K$. Then the result of Lemma 4.7 implies
that $\{y_m\}$ tends to 0.

The sets $\{k + \tfrac12 U \mid k \in K\}$ form an open cover of $K$. If $\{k_1 + \tfrac12 U, \dots, k_n + \tfrac12 U\}$
is a finite subcover, we prove that $\{k_1, \dots, k_n\}$ spans $X$. It is enough to prove that
$S = \{k_1, \dots, k_n\}$ spans $U$. If $x$ is in $U$, then $x$ is in one of the sets of the finite
subcover, say $k_{j_1} + \tfrac12 U$. Write $x = k_{j_1} + \tfrac12 u_1$ accordingly. The finite subcover
covers $K$ and hence its interior $U$, and thus $\tfrac12 U$ is covered by $\tfrac12(k_1 + \tfrac12 U), \dots,
\tfrac12(k_n + \tfrac12 U)$. Applying this observation to the element $\tfrac12 u_1$ of $\tfrac12 U$, we see that $x$
is in $k_{j_1} + \tfrac12(k_{j_2} + \tfrac12 U)$ for some $k_{j_2}$. Write $x = k_{j_1} + \tfrac12 k_{j_2} + \tfrac14 u_2$ accordingly.
Continuing in this way, we see that

\[
x \ \text{is in} \ \ k_{j_1} + \tfrac12 k_{j_2} + \dots + \tfrac{1}{2^{r-1}} k_{j_r} + \tfrac{1}{2^r} U \quad \text{for each } r.
\]

Put $x_r = k_{j_1} + \tfrac12 k_{j_2} + \dots + \tfrac{1}{2^{r-1}} k_{j_r}$. This is an element of the finite-dimensional
subspace spanned by $S$, which is closed by Corollary 4.6; thus if $\{x_r\}$ converges,
it must converge to a member $x_0$ of this subspace. Using the result of the previous
paragraph, we shall show that $x - x_r$ converges to 0. Then we can conclude that
$x_r$ converges to $x$, hence that $x$ is in the span of $S$. To see that $x - x_r$ converges
to 0, choose $l$ such that $|\delta_0| \le 2^{-l}$ implies $\delta_0 K \subseteq U$. Applying the criterion of the
previous paragraph, let $\delta > 0$ be given. Choose $M$ such that $2^{-M}\delta^{-1} \le 2^{-l}$. Then
$m \ge M$ implies that $2^{-m}\delta^{-1} \le 2^{-M}\delta^{-1} \le 2^{-l}$. Thus $2^{-m}\delta^{-1}$ is an allowable
choice of $\delta_0$, and we therefore obtain $2^{-m}\delta^{-1}K \subseteq U$ and $2^{-m}K \subseteq \delta U$. For
$m \ge M$, the element $x - x_m$ lies in $2^{-m}U \subseteq 2^{-m}K$, and we have just proved that
$2^{-m}K \subseteq \delta U$. Thus $x - x_m$ lies in $\delta U$, and the criterion of the previous paragraph
applies. Hence $x - x_m$ tends to 0. This completes the proof.
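
The "continuing in this way" step is an induction that can be recorded as a recursion. The sketch below uses the auxiliary elements $u_i \in U$ from the proof; the recursion is written under the assumption that at each stage the finite subcover is applied to $u_i \in U \subseteq K$.

```latex
\begin{align*}
x   &= k_{j_1} + \tfrac12 u_1,          & u_1     &\in U, \\
u_i &= k_{j_{i+1}} + \tfrac12 u_{i+1},  & u_{i+1} &\in U \quad (i \ge 1), \\
x   &= \sum_{i=1}^{r} 2^{-(i-1)} k_{j_i} + 2^{-r} u_r = x_r + 2^{-r} u_r,
    & x - x_r &\in 2^{-r} U \subseteq 2^{-r} K .
\end{align*}
```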


2. $C^\infty(U)$, Distributions, and Support


As was mentioned in Section III.1, distributions are continuous linear
functionals on vector spaces of smooth functions. Their properties are deceptively
simple-looking and enormously helpful in working with linear partial differential
equations. We considered tempered distributions in Section III.1; these are the
continuous linear functionals on the space $\mathcal{S}(\mathbb{R}^N)$ of Schwartz functions on $\mathbb{R}^N$.
In this section we study the topology on the space $C^\infty(U)$ of arbitrary
scalar-valued smooth functions on an open subset $U$ of $\mathbb{R}^N$, together with the associated
space of distributions.

To topologize $C^\infty(U)$, we use the family of seminorms indexed by pairs $(K, P)$
with $K$ a compact subset of $U$ and with $P$ a polynomial, the $(K, P)$ seminorm
being $\|f\|_{K,P} = \sup_{x \in K} |(P(D)f)(x)|$. The resulting topology is Hausdorff,
and $C^\infty(U)$ becomes a topological vector space.

Let us see that this topology is given by a countable subfamily of these
seminorms and is therefore implemented by a metric. It is certainly sufficient to
consider only the monomials $D^\alpha$ instead of all polynomials $P(D)$, and thus the
$P$ index of $(K, P)$ can be assumed to run through a countable set. We make use
of a notion already used in Section II.2. An exhausting sequence of compact
subsets of $U$ is an increasing sequence of compact sets with union $U$ such that
each set is contained in the interior of the next set. An exhausting sequence
exists in any locally compact separable metric space. If $\{K_n\}$ is an exhausting
sequence for $U$ and if $K$ is a compact subset of $U$, then the interiors $K_n^o$ of
the $K_n$'s form an open cover of $K$, and there is a finite subcover; since the
members of the open cover are nested, $K$ is contained in some single $K_n^o$ and
hence in $K_n$. Therefore $\|f\|_{K,P} \le \|f\|_{K_n,P}$ for every $P$, and we can discard
all the seminorms except the ones from some $K_n$. In short, the countably many
seminorms $\|f\|_{K_n,\alpha} = \sup_{x \in K_n} |(D^\alpha f)(x)|$ suffice to determine the topology of
$C^\infty(U)$. In particular, the topology is independent of the choice of exhausting
sequence.
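
One standard way to pass from a countable family of seminorms to a metric is sketched below. The enumeration $p_1, p_2, \dots$ of the seminorms $\|\cdot\|_{K_n,\alpha}$ and the particular formula are assumptions made for illustration; the construction referred to in Section III.1 may differ in its details.

```latex
\begin{gather*}
d(f, g) = \sum_{k=1}^{\infty} 2^{-k} \min\{1,\ p_k(f - g)\}, \\
d(f_j, f) \to 0 \quad\Longleftrightarrow\quad
  p_k(f_j - f) \to 0 \ \text{ for each } k .
\end{gather*}
```

The triangle inequality for $d$ follows from the subadditivity of each $p_k$ and of $t \mapsto \min\{1, t\}$ on $[0, \infty)$, and $d(f, g) = 0$ forces $f = g$ because the family is separating.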

After the statement of Theorem 3.9, we constructed a smooth partition of unity
$\{\psi_n\}_{n \ge 1}$ associated to an exhausting sequence $\{K_n\}_{n \ge 1}$ of an open subset $U$ of
$\mathbb{R}^N$. Such a partition of unity is sometimes useful, and Problem 9 at the end of
the chapter illustrates this fact. The functions $\psi_n$ are in $C^\infty(U)$ and have the
properties that $\sum_{n=1}^{\infty} \psi_n(x) = 1$ on $U$, $\psi_1(x) > 0$ on $K_3$, $\psi_1(x) = 0$ on $(K_4^o)^c$,
and for $n \ge 2$,

\[
\psi_n(x)
\begin{cases}
> 0 & \text{for } x \in K_{n+2} - K_{n+1}^o, \\
= 0 & \text{for } x \in (K_{n+3}^o)^c \cup K_n .
\end{cases}
\]

Since $C^\infty(U)$ is a metric space, its topology may be characterized in terms of
convergence of sequences: a sequence of functions converges in $C^\infty(U)$ if and
only if the functions converge uniformly on each compact subset of $U$ and so does
each of their iterated partial derivatives.

If a particular metric for $C^\infty(U)$ is specified as constructed in Section III.1
from an enumeration of some determining countable family of seminorms, then
it is apparent that a sequence of functions is Cauchy in $C^\infty(U)$ if and only if the
functions and all their iterated partial derivatives are uniformly Cauchy on each
compact subset of $U$. As a consequence we can see that $C^\infty(U)$ is complete as a
metric space: in fact, let us extract limits from each uniformly Cauchy sequence of
derivatives and use the standard theorem on derivatives of convergent sequences
whose derivatives converge uniformly; the result is that we obtain a member of
$C^\infty(U)$ to which the Cauchy sequence converges.

It is unimportant which particular metric is used for this completeness
argument. The relevant consequence is that the Baire Category Theorem⁴ is applicable
to $C^\infty(U)$, and the statement of the Baire Category Theorem makes no reference
to a particular metric.

In similar fashion one checks that $\mathcal{S}(\mathbb{R}^N)$, whose topology is likewise given
by countably many seminorms, is complete as a metric space.

The vector space of continuous linear functionals on $C^\infty(U)$, i.e., its
continuous dual, is called the space of all distributions of compact support on $U$ and is
traditionally⁵ denoted by $\mathcal{E}'(U)$. The words “of compact support” require some
explanation and justification, which we come back to after giving an example.


EXAMPLE. Take finitely many complex Borel measures $\rho_\alpha$ of compact support
on $U$, the indexing being by the set of $N$-tuples $\alpha$ of nonnegative integers with
$|\alpha| \le m$, and define

\[
T(\varphi) = \sum_{|\alpha| \le m} \int_U D^\alpha \varphi(x)\,d\rho_\alpha(x) .
\]

It is easy to check that $T$ is a distribution of compact support on $U$. A theorem
in Chapter V will provide a converse, saying essentially that every continuous
linear functional on $C^\infty(U)$ is of this form.
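
The continuity check for a functional of this kind might go as follows. In this sketch $K$ is assumed to be a compact subset of $U$ containing the supports of all the measures $\rho_\alpha$, and $|\rho_\alpha|$ denotes the total variation measure; both names are introduced here for illustration.

```latex
\begin{gather*}
|T(\varphi)|
  \le \sum_{|\alpha| \le m} \int_K |D^\alpha \varphi(x)|\,d|\rho_\alpha|(x)
  \le \sum_{|\alpha| \le m} |\rho_\alpha|(K)\, \sup_{x \in K} |D^\alpha \varphi(x)| . \\[4pt]
\text{Special case: } \rho_\alpha = \text{point mass at } x_0 \in U
  \text{ for a single } \alpha: \qquad T(\varphi) = D^\alpha \varphi(x_0) .
\end{gather*}
```

A linear functional dominated by a finite sum of multiples of the defining seminorms is continuous, which is the property being verified.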


Let us observe that the vector subspace $C^\infty_{\mathrm{com}}(U)$ is dense in $C^\infty(U)$. In fact, let
$\{K_j\}$ be an exhausting sequence of compact sets in $U$, and choose $\psi_j \in C^\infty_{\mathrm{com}}(\mathbb{R}^N)$
by Proposition 3.5f to be 1 on $K_j$ and 0 off $K_{j+1}$. If $f$ is in $C^\infty(U)$, then $\psi_j f$ is
in $C^\infty_{\mathrm{com}}(U)$ and tends to $f$ in every seminorm on $C^\infty(U)$.
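
The convergence in every seminorm can be seen directly. In the sketch below, $j_0$ is an index (an assumption of the illustration) with $K \subseteq K_{j_0}^o$, which exists for each compact $K \subseteq U$ because the interiors of an exhausting sequence are nested and cover $U$; $\psi_j$ denotes the cutoff function that is 1 on $K_j$.

```latex
\begin{gather*}
j \ge j_0 \ \Longrightarrow\ \psi_j = 1 \ \text{ on } K_j \supseteq K_{j_0}
  \ \Longrightarrow\ \psi_j f - f = 0 \ \text{ on the open set } K_{j_0}^o \supseteq K, \\
\|\psi_j f - f\|_{K,P} = \sup_{x \in K} \bigl| \bigl(P(D)(\psi_j f - f)\bigr)(x) \bigr| = 0
  \quad \text{for } j \ge j_0 .
\end{gather*}
```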

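To obtain a useful notion of “support” for a distribution, we need the following
lemma.


Lemma 4.9. If $U_1$ and $U_2$ are nonempty open sets in $\mathbb{R}^N$ and if $\varphi$ is in
$C^\infty_{\mathrm{com}}(U_1 \cup U_2)$, then there exist $\varphi_1 \in C^\infty_{\mathrm{com}}(U_1)$ and $\varphi_2 \in C^\infty_{\mathrm{com}}(U_2)$ such that
$\varphi = \varphi_1 + \varphi_2$.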


PROOF. Let $L$ be the compact support of $\varphi$, and choose a compact set $K$ such
that $L \subseteq K^o \subseteq K \subseteq U_1 \cup U_2$. Then $\{U_1, U_2\}$ is a finite open cover of $K$,
and Lemma 3.15b of Basic produces an open cover $\{V_1, V_2\}$ of $K$ such that $V_1^{\mathrm{cl}}$
is a compact subset of $U_1$ and $V_2^{\mathrm{cl}}$ is a compact subset of $U_2$. Proposition 3.5f
produces functions $g_1 \in C^\infty_{\mathrm{com}}(U_1)$ and $g_2 \in C^\infty_{\mathrm{com}}(U_2)$ with values in $[0, 1]$ such
that $g_1$ is 1 on $V_1^{\mathrm{cl}}$ and $g_2$ is 1 on $V_2^{\mathrm{cl}}$. Then $g = g_1 + g_2$ is in $C^\infty_{\mathrm{com}}(U_1 \cup U_2)$ and
is $\ge 1$ on $K$. If $W$ is the open set where $g \ne 0$, then Proposition 3.5f produces a
function $h$ in $C^\infty_{\mathrm{com}}(W)$ with values in $[0, 1]$ such that $h$ is 1 on $K$. The function
$1 - h$ is smooth, has values in $[0, 1]$, is 1 where $g = 0$, and is 0 on $K$. Hence
$g + (1 - h)$ is a smooth function that is everywhere positive on $\mathbb{R}^N$ and equals $g$
on $K$. Therefore the functions $g_1/(g + 1 - h)$ and $g_2/(g + 1 - h)$ are smooth
functions on $\mathbb{R}^N$ compactly supported in $U_1$ and $U_2$, respectively, with sum equal
to 1 on $K$. If we define $\varphi_1 = g_1\varphi/(g + 1 - h)$ and $\varphi_2 = g_2\varphi/(g + 1 - h)$, then
$\varphi_1$ and $\varphi_2$ have the required properties.


⁴Theorem 2.53 of Basic.
⁵The tradition dates back to Laurent Schwartz’s work, in which $\mathcal{E}(U)$ was the notation for $C^\infty(U)$
and $\mathcal{E}'(U)$ was the space of continuous linear functionals.


Proposition 4.10. If $T$ is an arbitrary linear functional on $C^\infty_{\mathrm{com}}(U)$ and if $U'$
is the union of all open subsets $U_\gamma$ of $U$ such that $T$ vanishes on $C^\infty_{\mathrm{com}}(U_\gamma)$, then
$T$ vanishes on $C^\infty_{\mathrm{com}}(U')$.


PROOF. Let $\varphi$ be in $C^\infty_{\mathrm{com}}(U')$, and let $K$ be the support of $\varphi$. The open sets
$U_\gamma$ form an open cover of $K$, and some finite subcollection must have
$K \subseteq U_{\gamma_1} \cup \dots \cup U_{\gamma_p}$. Lemma 4.9 applied inductively shows that $\varphi$ is the sum of
functions in $C^\infty_{\mathrm{com}}(U_{\gamma_j})$, $1 \le j \le p$. Since $T$ is 0 on each of these, it is 0 on the
sum.


If $T$ is in $\mathcal{E}'(U)$, the support of $T$ is the complement of the set $U'$ in Proposition
4.10, i.e., the complement of the union of all open sets $U_\gamma$ such that $T$ vanishes on
$C^\infty_{\mathrm{com}}(U_\gamma)$. If $T$ has empty support, then $T = 0$ because $T$ vanishes on $C^\infty_{\mathrm{com}}(U)$
and $C^\infty_{\mathrm{com}}(U)$ is dense in $C^\infty(U)$.

Proposition 4.11. Every member $T$ of $\mathcal{E}'(U)$ has compact support.


REMARKS. For the moment this proposition justifies using the name
“distributions of compact support” for the continuous linear functionals on $C^\infty(U)$.
After we define general distributions in Section V.1, we shall have to return to
this matter.


PROOF. Let $\{K_n\}$ be an exhausting sequence of compact sets in $U$. If $T$ is not
supported in any $K_n$, then there is some $f_n$ in $C^\infty_{\mathrm{com}}(U - K_n)$ with $T(f_n) \ne 0$.
Put $g_n = f_n/T(f_n)$, so that $T(g_n) = 1$. If $K$ is any compact subset of $U$, then
$K \subseteq K_n$ for large $n$, and $g_n|_K = 0$ for such $n$. Thus $g_n$ tends to 0 in $C^\infty(U)$
while $T(g_n)$ tends to $1 \ne 0 = T(0)$, in contradiction to continuity of $T$.


Similarly we can use Proposition 4.10 to define the support of a tempered
distribution $T$ in $\mathcal{S}'(\mathbb{R}^N)$ as the complement of the union of all open sets $U_\gamma$ such
that $T$ vanishes on $C^\infty_{\mathrm{com}}(U_\gamma)$. Tempered distributions need not have compact
support; for example, the function 1 defines a tempered distribution whose support
is $\mathbb{R}^N$.

In the case of tempered distributions, a little argument is required to show that
the only tempered distribution with empty support is the 0 distribution. What is
needed is the following fact.


Proposition 4.12. $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ is dense in $\mathcal{S}(\mathbb{R}^N)$.


REMARKS. If $T$ in $\mathcal{S}'(\mathbb{R}^N)$ has empty support, then $T$ vanishes on $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$.
Proposition 4.12 and the continuity of $T$ imply that $T = 0$ on $\mathcal{S}(\mathbb{R}^N)$. Thus the
only tempered distribution with empty support is the 0 distribution.


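PROOF. Fix $h$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ with values in $[0, 1]$ such that $h(x)$ is 1 for $|x| \le 1$
and is 0 for $|x| \ge 2$. Define $h_R(x) = h(R^{-1}x)$. If $\varphi$ is in $\mathcal{S}(\mathbb{R}^N)$, we shall
show that $\lim_{R \to \infty} h_R\varphi = \varphi$ in the metric space $\mathcal{S}(\mathbb{R}^N)$, and then the proposition
will follow. Thus we want $\lim_{R \to \infty} \sup_{x \in \mathbb{R}^N} |x^\gamma D^\alpha(\varphi - h_R\varphi)(x)| = 0$. By
the Leibniz rule, $D^\alpha(h_R\varphi) = h_R D^\alpha\varphi + \sum_{\beta < \alpha} c_\beta (D^{\alpha-\beta}h_R)(D^\beta\varphi)$. Hence it is
enough to prove that

\[
\lim_{R \to \infty} \sup_{x \in \mathbb{R}^N} |x^\gamma (1 - h_R) D^\alpha\varphi| = 0
\]
and
\[
\lim_{R \to \infty} \sup_{x \in \mathbb{R}^N} |x^\gamma (D^{\alpha-\beta}h_R)(D^\beta\varphi)| = 0 \quad \text{for } \beta < \alpha .
\]

The first of these limit formulas is a consequence of the fact that $x^\gamma D^\alpha\varphi$
vanishes at infinity, which in turn follows from the fact that $x^\gamma(1 + |x|^2)D^\alpha\varphi$ is
bounded, i.e., that $\|\varphi\|_{x^\gamma(1+|x|^2),\,x^\alpha}$ is finite. For the second of these limit
formulas, we observe from the chain rule that $D^{\alpha-\beta}h_R(x) = R^{-|\alpha-\beta|} D^{\alpha-\beta}h(R^{-1}x)$.
For $\beta < \alpha$, this function is dominated in absolute value by $C_\beta R^{-1}$ when $R \ge 1$. Hence
$\sup_{x \in \mathbb{R}^N} |x^\gamma (D^{\alpha-\beta}h_R)(D^\beta\varphi)| \le C_\beta R^{-1} \|\varphi\|_{x^\gamma,\,x^\beta}$, and the limit on $R$ is 0.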
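
The first limit formula can be made quantitative. In the sketch below, $x^\gamma$ stands for the monomial factor and $\varphi$ for the Schwartz function in the proof (these symbol names are assumptions where the printed text is unclear); the estimate uses only that $0 \le 1 - h_R \le 1$ and that $1 - h_R$ vanishes for $|x| \le R$.

```latex
\begin{align*}
\sup_{x \in \mathbb{R}^N} |x^\gamma (1 - h_R(x)) D^\alpha\varphi(x)|
  &\le \sup_{|x| \ge R} |x^\gamma D^\alpha\varphi(x)| \\
  &\le \frac{1}{1 + R^2}\, \sup_{x \in \mathbb{R}^N} |x^\gamma (1 + |x|^2) D^\alpha\varphi(x)| \\
  &= \frac{1}{1 + R^2}\, \|\varphi\|_{x^\gamma(1+|x|^2),\,x^\alpha} .
\end{align*}
```

The middle step holds because $1 \le (1 + |x|^2)/(1 + R^2)$ when $|x| \ge R$.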


3. Weak and Weak-Star Topologies, Alaoglu’s Theorem 


Let $X$ be a normed linear space, and let $X^*$ be its dual, which we know to be
a Banach space. We have defined the weak topology on $X$ to be the weakest
topology on $X$ making all members of $X^*$ continuous, i.e., making $x \mapsto x^*(x)$
continuous for each $x^*$ in $X^*$. This topology is given by the family of seminorms
$\|x\|_{x^*} = |x^*(x)|$ indexed by $X^*$. The weak-star topology on $X^*$ relative to $X$
is the weakest topology on $X^*$ making all members of $\iota(X)$ continuous,⁶ i.e.,
making $x^* \mapsto x^*(x)$ continuous for each $x$ in $X$. This topology is given by
the family of seminorms $\|x^*\|_x = |x^*(x)|$ indexed by $X$. In this section we
study these topologies⁷ in more detail, proving an important theorem about the
weak-star topology.


⁶The symbol $\iota$ denotes the canonical map $X \to X^{**}$ given by $\iota(x)(x^*) = x^*(x)$.

We shall discuss some examples in a moment. The space X* is a normed linear 
space in its own right, and therefore it has a well-defined weak topology. The 
definitions make the weak topology on X* the same as the weak-star topology on 
X* relative to X if X is reflexive, but we cannot draw this conclusion in general. 

The weak topology on X is of less importance to real analysis than the weak- 
star topology on X*, and thus the main interest in the weak topology on X will 
be in the case that X is reflexive. It is also true that exact conditions that interpret 
the weak or weak-star topology in a particular example tend not to be useful. 
Nevertheless, it may still be helpful to consider examples in order to get a better 
sense of what these topologies do. 

We shall discuss the examples in terms of convergence. However, the conver- 
gence will involve only convergence of sequences, not convergence of general 
nets. A difficulty with nets is that one cannot draw familiar conclusions from 
convergence of nets even in the case of nets in the real numbers; for example, a 
convergent net of real numbers need not be bounded, just eventually bounded. 

In order to have it available in the discussion, we prove one fact about con- 
vergence of sequences in weak and weak-star topologies before coming to the 
examples. 


Proposition 4.13. Let $X$ be a normed linear space, and let $X^*$ be its dual space.


(a) If $\{x_n\}$ is a sequence in $X$ converging to $x_0$ in the weak topology on $X$, then
$\{\|x_n\|\}$ is a bounded sequence in $\mathbb{R}$ and $\|x_0\| \le \liminf_n \|x_n\|$.

(b) If $X$ is a Banach space and if $\{x_n^*\}$ is a sequence in $X^*$ converging to $x_0^*$ in
the weak-star topology on $X^*$ relative to $X$, then $\{\|x_n^*\|\}$ is a bounded sequence
in $\mathbb{R}$ and $\|x_0^*\| \le \liminf_n \|x_n^*\|$.


PROOF. For the first half of (a), let $\iota : X \to X^{**}$ be the canonical map. Since
the sequence $\{\iota(x_n)(x^*)\}$ converges to $x^*(x_0)$ for each $x^*$ in $X^*$, $\{\iota(x_n)\}$ is a set
of bounded linear functionals on the Banach space $X^*$ with $\{\iota(x_n)(x^*)\}$ bounded
for each $x^*$ in $X^*$. By the Uniform Boundedness Theorem the norms $\|\iota(x_n)\|$
are bounded. Since $\iota$ preserves norms as a consequence of the Hahn–Banach
Theorem, the norms $\|x_n\|$ are bounded. For the second half of (a), let $x^*$ be
arbitrary in $X^*$ with $\|x^*\| \le 1$. Then

\[
|x^*(x_0)| = \lim_n |x^*(x_n)| \le \liminf_n \|x^*\|\,\|x_n\| \le \liminf_n \|x_n\| .
\]

Taking the supremum over $x^*$ with $\|x^*\| \le 1$ and applying the formula $\|x_0\| =
\sup_{\|x^*\| \le 1} |x^*(x_0)|$, which is known from the Hahn–Banach Theorem, we obtain
$\|x_0\| \le \liminf_n \|x_n\|$.


⁷The weak topology on $X$ is also called the $X^*$ topology of $X$, and the weak-star topology on
$X^*$ is also called the $X$ topology of $X^*$.

For the first half of (b), $\{x_n^*\}$ is a set of bounded linear functionals on the Banach
space $X$ with $\{x_n^*(x)\}$ bounded for each $x$ in $X$. Then the Uniform Boundedness
Theorem shows that the norms $\|x_n^*\|$ are bounded. For the second half of (b), let
$x$ be arbitrary in $X$ with $\|x\| \le 1$. Then

\[
|x_0^*(x)| = \lim_n |x_n^*(x)| \le \liminf_n \|x_n^*\|\,\|x\| \le \liminf_n \|x_n^*\| .
\]

Taking the supremum over $x$ and applying the definition of $\|x_0^*\|$, we obtain
$\|x_0^*\| \le \liminf_n \|x_n^*\|$.
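
The inequality in part (a) can be strict. A standard illustration, stated here under the usual identification of the dual of $\ell^2$ with $\ell^2$ (an assumption not spelled out in this section), uses the unit coordinate vectors.

```latex
\begin{gather*}
X = \ell^2, \qquad e_n = (0, \dots, 0, 1, 0, \dots) \ \text{ with the } 1 \text{ in position } n, \\
x^*(e_n) = \sum_{k} (e_n)_k\, y_k = y_n \longrightarrow 0
  \quad \text{for each } x^* \text{ represented by } y = (y_k) \in \ell^2, \\
\|e_n\| = 1 \ \text{ for all } n, \qquad
  \|0\| = 0 < 1 = \liminf_n \|e_n\| .
\end{gather*}
```

Thus $\{e_n\}$ converges weakly to 0 with bounded norms, as the proposition requires, but it does not converge to 0 in norm.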


EXAMPLES OF CONVERGENCE IN WEAK TOPOLOGIES.

(1) $X = L^p(S, \mu)$ when $1 < p < \infty$. Then $X^* = L^{p'}(S, \mu)$, where $p'$ is
the dual index⁸ of $p$. The assertion is that a sequence $\{f_n\}$ tends weakly to $f$
in $L^p$ if and only if $\{\|f_n\|_p\}$ is bounded and $\lim_n \int_E f_n\,d\mu = \int_E f\,d\mu$ for every
measurable subset $E$ of $S$ of finite measure. The necessity is immediate from
Proposition 4.13a and from taking the member of $X^*$ to be the indicator function
of $E$. Let us prove the sufficiency. From $\lim_n \int_E f_n\,d\mu = \int_E f\,d\mu$, we see that
$\lim_n \int_S f_n t\,d\mu = \int_S f t\,d\mu$ for $t$ simple if $t$ is 0 off a set of finite measure. Let $g$
be given in $L^{p'}(S, \mu)$, and choose a sequence $\{t_m\}$ of simple functions equal to 0
off sets of finite measure such that $\lim_m t_m = g$ in the norm topology of $L^{p'}$. For
all $m$ and $n$, we have

\begin{align*}
\Bigl| \int_S f_n g\,d\mu - \int_S f g\,d\mu \Bigr|
  &\le \Bigl| \int_S f_n (g - t_m)\,d\mu \Bigr|
     + \Bigl| \int_S f_n t_m\,d\mu - \int_S f t_m\,d\mu \Bigr|
     + \Bigl| \int_S f (t_m - g)\,d\mu \Bigr| \\
  &\le \|f_n\|_p \|g - t_m\|_{p'}
     + \Bigl| \int_S f_n t_m\,d\mu - \int_S f t_m\,d\mu \Bigr|
     + \|f\|_p \|g - t_m\|_{p'} .
\end{align*}

The first and third terms on the right tend to 0 as $m$ tends to infinity, uniformly in
$n$. If $\epsilon > 0$ is given, choose $m$ such that those two terms are $< \epsilon$, and then, with
$m$ fixed, choose $n$ large enough to make the middle term $< \epsilon$.


(2) $X = C(S)$ with $S$ compact Hausdorff, $C(S)$ being the space of continuous
scalar-valued functions on $S$. Then $X^*$ may be identified with the space $M(S)$ of
(signed or) complex regular Borel measures on $S$, with the total-variation norm.⁹
The assertion is that a sequence $\{f_n\}$ tends weakly to $f$ in $C(S)$ if and only if
$\{\|f_n\|\}$ is bounded and $\lim f_n = f$ pointwise. The necessity is immediate from
Proposition 4.13a and from taking the member of $X^*$ to be any point mass at a point
of $S$. For the sufficiency we simply observe that any member of $M(S)$ is a finite
linear combination of regular Borel measures $\mu$ on $S$ and $\lim_n \int_S f_n\,d\mu = \int_S f\,d\mu$
for any Borel measure $\mu$ by dominated convergence.


⁸The index $p'$ is defined by $\frac{1}{p} + \frac{1}{p'} = 1$. This duality was proved in Theorem 9.19 of Basic
when $\mu$ is $\sigma$-finite, but it holds without this restrictive assumption on $\mu$.

⁹This identification was obtained in Basic in Theorem 11.24 for real scalars and in Theorem 11.26
for complex scalars. The starting point for the identification is the Riesz Representation Theorem.


(3) $X = C_0(S)$ with $S$ locally compact separable metric, $C_0(S)$ being the space
of continuous scalar-valued functions vanishing at infinity. Again the dual $X^*$
may be identified with the space $M(S)$ of complex regular Borel measures on
$S$, with the total-variation norm. This example can be handled by applying the
previous example to the one-point compactification of $S$. All signed or complex
Borel measures are automatically regular in this case. A sequence $\{f_n\}$ tends
weakly to $f$ in $C_0(S)$ if and only if $\{\|f_n\|\}$ is bounded and $\lim f_n = f$ pointwise.
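
The criterion "bounded and pointwise convergent" can be tested on a concrete sequence. The following sketch uses $S = [0,1]$ and a family of tent functions; both are assumptions chosen for illustration, not examples from the text.

```latex
% Assumed setting: S = [0,1], n >= 2.  Tent of height 1 with peak at 1/n.
\[
  f_n(x) =
  \begin{cases}
    nx,      & 0 \le x \le 1/n, \\
    2 - nx,  & 1/n \le x \le 2/n, \\
    0,       & 2/n \le x \le 1 .
  \end{cases}
\]
% Norms are bounded:      \|f_n\|_{\sup} = f_n(1/n) = 1.
% Pointwise limit is 0:   f_n(0) = 0, and f_n(x) = 0 as soon as 2/n < x.
\[
  \|f_n\|_{\sup} = 1 \ \text{for all } n,
  \qquad
  \lim_n f_n(x) = 0 \ \text{for every } x \in [0,1] .
\]
% Conclusion: f_n -> 0 weakly in C([0,1]), but not in norm.
```

By contrast, $f_n(x) = x^n$ is bounded in $C([0,1])$ but its pointwise limit is discontinuous at $x = 1$, so that sequence has no weak limit in $C([0,1])$.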


EXAMPLES OF CONVERGENCE IN WEAK-STAR TOPOLOGIES. 

(1) $X = L^p(S, \mu)$ and $X^* = L^{p'}(S, \mu)$ when $1 < p < \infty$, $p'$ being the
dual index of $p$. This $X$ is reflexive. Therefore the first example of convergence
in weak topologies shows that $\{f_n\}$ converges weak-star to $f$ in $L^{p'}(S, \mu)$ relative to
$L^p(S, \mu)$ if and only if $\{\|f_n\|_{p'}\}$ is bounded and $\lim_n \int_E f_n\,d\mu = \int_E f\,d\mu$ for
every measurable subset $E$ of $S$ of finite measure.

(2) $X = L^1(S, \mu)$ and $X^* = L^\infty(S, \mu)$ when $\mu$ is $\sigma$-finite. This $X$ is usually
not reflexive. However, the condition for weak-star convergence is the same
as in the previous example: $\{f_n\}$ converges weak-star in $L^\infty(S, \mu)$ relative to
$L^1(S, \mu)$ if and only if $\{\|f_n\|_\infty\}$ is bounded and $\lim_n \int_E f_n\,d\mu = \int_E f\,d\mu$ for
every measurable subset $E$ of $S$ of finite measure. The argument in the first
example of convergence in weak topologies can easily be modified to prove this.


(3) $X = C(S)$ with $S$ compact Hausdorff, and $X = C_0(S)$ with $S$ locally
compact separable metric. Weak-star convergence of complex regular Borel
measures does not have a useful necessary and sufficient condition beyond the
definition. The notion of weak-star convergence in this situation is, nevertheless,
quite helpful as a device for producing new complex measures out of old ones.¹⁰
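
To make the definition concrete, here is a small worked example of weak-star convergence of measures. It is a sketch under assumed data: $S = [0,1]$ and the point masses $\delta_{1/n}$ are chosen here for illustration and do not appear in the text.

```latex
% Assumed setting: S = [0,1]; \delta_a is the unit point mass at a.
% Weak-star convergence in M(S) relative to C(S) is tested against
% each continuous f:
\[
  \int_S f \, d\delta_{1/n} = f(1/n) \longrightarrow f(0)
  = \int_S f \, d\delta_0
  \qquad \text{for every } f \in C([0,1]) .
\]
% So \delta_{1/n} -> \delta_0 weak-star.  There is no norm convergence,
% because the two point masses sit at different points:
\[
  \|\delta_{1/n} - \delta_0\| = 2 \qquad \text{for all } n \ge 1 .
\]
```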


A theorem about the weak topology, due to Banach, is that the vector subspaces 
that are closed in the weak topology are the same as the vector subspaces that are 
closed in the norm topology. More generally the closed convex sets coincide in 
the weak and norm topologies. We shall not have occasion to use this theorem or 
mention any of its applications, and we therefore omit the proof. 

The weak-star topology has results of more immediate interest, and we turn 
our attention to those. Theorem 5.58 of Basic established for any separable 
normed linear space X that any bounded sequence in the dual X* has a weak- 
star convergent subsequence; this was called a “preliminary form of Alaoglu’s 
Theorem.” 


¹⁰ Warning. Many probabilists and some other people use the unfortunate term “weak convergence”
for this instance of weak-star convergence.
Theorem 4.14. Let $X$ be a normed linear space with dual $X^*$.


(a) (Alaoglu’s Theorem) The closed unit ball of $X^*$ is compact in the weak-star
topology relative to $X$.

(b) If $X$ is separable, then the closed unit ball of $X^*$ is a separable metric space
in the weak-star topology.


REMARKS. By (a), any net $\{x^*_\alpha\}$ in $X^*$ with $\|x^*_\alpha\|$ bounded has a subnet $\{x^*_{\alpha_\mu}\}$
and an element $x^*_0$ in $X^*$ such that $x^*_{\alpha_\mu}(x) \to x^*_0(x)$ for every $x$ in $X$. By (b),
this conclusion about nets can be replaced by a conclusion about sequences if
$X$ is separable. Thus we recover the “preliminary form” of Alaoglu’s Theorem.
The results of Section III.4 give an example of the utility of the two parts of this
theorem; together they lead to a proof that harmonic functions in $\mathcal{H}^p(\mathbb{R}^{N+1}_+)$ are
automatically Poisson integrals of functions if $p > 1$ or of complex measures if
$p = 1$.

PROOF. Let $B$ be the closed unit ball in $X^*$, let $D(r)$ be the closed disk in $\mathbb{C}$
with radius $r$ and center 0, and let $C = \prod_{x \in X} D(\|x\|)$. Define $F : B \to C$ by
$F(x^*) = (x^*(x))_{x \in X}$. The function $F$ is well defined since $|x^*(x)| \le \|x\|$ for
all $x^*$ in $B$ and all $x$ in $X$. It is continuous as a map into the product space since
$x^* \mapsto x^*(x)$ is continuous for each $x$, it is one-one since $x^*$ is determined by
its values on each $x$, and it is a homeomorphism with its image by definition of
weak topology. Since $C$ is compact by the Tychonoff Product Theorem, (a) will
follow if it is shown that $F(B)$ is closed in $C$. Let $p_x$ denote the projection of
$C$ to its $x$th coordinate. If $x$ and $x'$ are in $X$ and if $\{f_\alpha\}$ is a net in $C$ convergent
to $f_0$ in $C$, then an equality $p_{x+x'}(f_\alpha) = p_x(f_\alpha) + p_{x'}(f_\alpha)$ for all $\alpha$ implies that
$p_{x+x'}(f_0) = p_x(f_0) + p_{x'}(f_0)$ by continuity of $p_{x+x'}$, $p_x$, and $p_{x'}$. Thus the set

$$S(x, x') = \{f \in C \mid p_{x+x'}(f) = p_x(f) + p_{x'}(f)\}$$

is closed, and similarly the set

$$T(x, c) = \{f \in C \mid c\,p_x(f) = p_{cx}(f)\}$$

is closed. The intersection of all $S(x, x')$’s and all $T(x, c)$’s is the set of linear
members of $C$, hence is exactly $F(B)$. Thus $F(B)$ is closed.

For (b), we continue with $B$ and $D(r)$ as above, but we change $C$ and $F$
slightly. Let $\{x_n\}$ be a countable dense set in the norm topology of $X$, let $C =
\prod_n D(\|x_n\|)$, and define $F : B \to C$ by $F(x^*) = (x^*(x_n))_n$. As in the
proof of (a), $F$ is continuous. It is one-one since any $x^*$, being continuous, is
determined by its values on the dense set $\{x_n\}$. The domain is compact by (a). The
range space $C$ is a separable metric space and is in particular Hausdorff. Hence
$B$ is exhibited as homeomorphic to $F(B)$, which is a subspace of the separable
metric space $C$ and is therefore separable.
4. Stone Representation Theorem


In this section we begin to follow Alaoglu’s Theorem along paths different from 
its use for creating limit functions and measures out of sequences that are bounded 
in a weak-star topology. We shall work in this section with what amounts to an 
example—one of the motivating examples behind a stunning idea of I. M. Gelfand 
around 1940 that brings algebra, real analysis, and complex analysis together in 
a profound way. The example gives a view of subalgebras of the algebra B(S) 
of all bounded functions on a set S in terms of compactness. The stunning idea 
that came out, on which we shall elaborate shortly, is that the mechanism in the 
proof is the same mechanism that lies behind the Fourier transform on $\mathbb{R}^N$, that
this mechanism can be cast in abstract form as a theory of commutative Banach 
algebras, and that the theory gives a new perspective about spectra. In particular, 
it leads directly to the full Spectral Theorem for bounded and unbounded self- 
adjoint operators, extending the theorem for compact self-adjoint operators that 
was proved as Theorem 2.3. In turn, the Spectral Theorem has many applications 
to the study of particular operators. 

Let us first state the theorem about B(S), then discuss Gelfand’s stunning idea 
about the mechanism, and finally give the proof of the theorem. We shall pursue 
the Gelfand idea in Sections 10-11 later in this chapter. 

We have discussed B(S) as the Banach space of bounded complex-valued 
functions on a nonempty set S, the norm being the supremum norm. In this 
Banach space pointwise multiplication makes B(S) into a complex associative 
algebra¹¹ with identity (namely the function 1), there is an operation of complex
conjugation, and there is a notion of positivity (namely pointwise positivity of a 
function). The theorem concerns subalgebras of B(S) containing 1, closed under 
conjugation, and closed under uniform limits. 


Theorem 4.15 (Stone Representation Theorem). Let $S$ be a nonempty set,
and let $A$ be a uniformly closed subalgebra of $B(S)$ with the properties that $A$
is stable under complex conjugation and contains 1. Then there exist a compact
Hausdorff space $S_1$, a function $p : S \to S_1$ with dense image, and a norm-preserving
algebra isomorphism $U$ of $A$ onto $C(S_1)$ preserving conjugation and
positivity, mapping 1 to 1, and having the property that $U(f)(p(s)) = f(s)$ for
all $s$ in $S$. If $S$ is a Hausdorff topological space and $A$ consists of continuous
functions, then $p$ is continuous.


¹¹ An associative algebra $A$ over $\mathbb{C}$ is a vector space with a $\mathbb{C}$ bilinear associative multiplication,
i.e., with an operation $A \times A \to A$ satisfying $(ab)c = a(bc)$, $a(b+c) = ab+ac$, $(a+b)c = ac+bc$,
and $a(\lambda c) = (\lambda a)c = \lambda(ac)$ if $\lambda$ is in $\mathbb{C}$ and $a, b, c$ are in $A$. This definition does not assume the
existence of an identity element.
The idea of the proof is to consider the Banach-space dual $A^*$ and focus on those
members of $A^*$ that are nonzero and respect multiplication, namely the nonzero continuous
multiplicative linear functionals on $A$. The ones that come immediately to
mind are the evaluations at each point: for a point $s$ of $S$, the evaluation at $s$ is given
by $e_s(f) = f(s)$, and it is a multiplicative linear functional, certainly of norm 1.
The set $S_1$ in the theorem will be the set of all such continuous multiplicative
linear functionals, the function $p$ will be given by $p(s) = e_s$ for $s \in S$, and
the mapping $U$ will be given by $U(f)(\ell) = \ell(f)$ for each multiplicative linear
functional $\ell$.

The Banach space $A \subseteq B(S)$, with its multiplication, is a Banach algebra in
the sense that it is an associative algebra over $\mathbb{C}$, with or without identity, such
that $\|fg\| \le \|f\|\,\|g\|$ for all $f$ and $g$ in $A$. Another well-known Banach algebra
is $L^1(\mathbb{R}^N)$. The norm in this case is the usual $L^1$ norm, and the multiplication is
convolution, which satisfies $\|f * g\|_1 \le \|f\|_1 \|g\|_1$ for all $f$ and $g$ in $L^1(\mathbb{R}^N)$.

The stunning idea of Gelfand’s is that the formula that defines $U$ in the Stone
theorem is the same formula that gives the Fourier transform in the case of $L^1(\mathbb{R}^N)$.
Specifically the nonzero multiplicative linear functionals in the case of $L^1(\mathbb{R}^N)$
are the evaluations at points of the Fourier transform, i.e., the mappings
$f \mapsto \widehat{f}(y) = \int_{\mathbb{R}^N} f(x) e^{-2\pi i x \cdot y}\,dx$. These linear functionals are multiplicative
because convolution goes into pointwise product under the Fourier transform.
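
The multiplicativity claim can be checked by a short computation. The following is a sketch of the standard Fubini argument, written with the convention $\widehat{f}(y) = \int_{\mathbb{R}^N} f(x) e^{-2\pi i x\cdot y}\,dx$ stated above; the variable names $x$, $t$, $y$ are chosen here for illustration.

```latex
% f, g in L^1(R^N);  (f * g)(x) = \int f(x - t) g(t) dt.
% Fubini applies because
%   \iint |f(x-t)| |g(t)| dx dt = \|f\|_1 \|g\|_1 < \infty.
\begin{align*}
  \widehat{f * g}(y)
  &= \int_{\mathbb{R}^N} \Bigl( \int_{\mathbb{R}^N} f(x - t)\, g(t)\, dt \Bigr)
       e^{-2\pi i x \cdot y}\, dx \\
  &= \int_{\mathbb{R}^N} g(t)\, e^{-2\pi i t \cdot y}
       \Bigl( \int_{\mathbb{R}^N} f(x - t)\, e^{-2\pi i (x - t) \cdot y}\, dx \Bigr) dt \\
  &= \widehat{f}(y)\, \widehat{g}(y) .
\end{align*}
% Thus f |-> \widehat{f}(y) respects the convolution product for each fixed y.
```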

What $A \subseteq B(S)$ and $L^1(\mathbb{R}^N)$ have in common is, in the first place, that
they are commutative Banach algebras. In addition, each has a conjugate-linear
mapping $f \mapsto f^*$ that respects multiplication: complex conjugation in the case
of $A$ and the map $f \mapsto f^*$ with $f^*(x) = \overline{f(-x)}$ in the case of $L^1(\mathbb{R}^N)$. These
conjugate-linear mappings interact well with the norm. The subalgebra $A$ of
$B(S)$ satisfies

(i) $\|f f^*\| = \|f\|\,\|f^*\|$ for all $f$,

(ii) $\|f^*\| = \|f\|$ for all $f$,

while $L^1(\mathbb{R}^N)$ satisfies just (ii). The theory that Gelfand developed applies best
when both (i) and (ii) are satisfied, as is the case with $A$ and also any $L^\infty$ space,
and it works somewhat when just (ii) holds, as with $L^1(\mathbb{R}^N)$.
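
For the reader who wants to see why the two norm identities hold, here is a brief verification. It is a sketch: property (i) is read as $\|ff^*\| = \|f\|\,\|f^*\|$ and property (ii) as $\|f^*\| = \|f\|$, with $f^* = \overline{f}$ on $A$ and $f^*(x) = \overline{f(-x)}$ on $L^1(\mathbb{R}^N)$.

```latex
% Supremum norm on A \subseteq B(S), with f^* = \bar f:
\[
  \|f \bar f\|_{\sup} = \sup_{s \in S} |f(s)|^2
  = \Bigl( \sup_{s \in S} |f(s)| \Bigr)^{2}
  = \|f\|_{\sup}\, \|\bar f\|_{\sup},
  \qquad
  \|\bar f\|_{\sup} = \|f\|_{\sup} .
\]
% L^1 norm on R^N, with f^*(x) = \overline{f(-x)}; substitute x -> -x:
\[
  \|f^*\|_1 = \int_{\mathbb{R}^N} |f(-x)|\, dx
  = \int_{\mathbb{R}^N} |f(x)|\, dx = \|f\|_1 .
\]
```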

Another example of a Banach algebra is the algebra B(H, H) of bounded 
linear operators from a Hilbert space H to itself, with the operator norm. The 
conjugate-linear mapping on B(H, H) is passage to the adjoint, and (i) and (ii) 
both hold. The thing that is missing is commutativity for B(H, H). However, 
if we take a single operator A and its adjoint A*, assume that A commutes with 
A*, and take the Banach algebra generated by A and A*, then we have another 
example to which the Gelfand theory applies well. The Spectral Theorem for 
bounded self-adjoint operators is the eventual consequence. 

The idea of considering the Banach subalgebra generated by A is a natural 
one because of one’s experience in the subject of modern algebra: the study of 
all complex polynomials in a square matrix A is a useful tool in understanding a
single linear transformation, including obtaining canonical forms for it like the 
Jordan form. Thus the use of an analogy with a topic in algebra leads one to a 
better understanding of a topic in analysis. 

In this case ideas flowed in the reverse direction as well. The multiplicative 
linear functionals correspond, by passage to their kernels, to those ideals in the 
algebra that are maximal.¹² In effect the Banach algebra was being studied
through its space of maximal ideals. About 1960, no doubt partly because of the 
success of the idea of considering the maximal ideals of a Banach algebra, the 
consideration of the totality of prime ideals of a commutative ring as a space began 
to play an important role in algebraic number theory and algebraic geometry. 


PROOF OF THEOREM 4.15. Let $S_1$ be the set of all nonzero continuous multiplicative
linear functionals $\ell$ on $A$ with $\ell(\bar f) = \overline{\ell(f)}$. Let us see that each such $\ell$
has norm 1. In fact, choose $f$ with $\ell(f) \ne 0$. Then $\ell(f) = \ell(f1) = \ell(f)\ell(1)$
shows that $\ell(1) = 1$, and hence $\|\ell\| \ge 1$. For any $f$ with $\|f\|_{\sup} \le 1$, if we had
$|\ell(f)| > 1$, then $|\ell(f)|^n = |\ell(f^n)| \le \|\ell\|$ for all $n$ would give a contradiction as
soon as $n$ is sufficiently large. We conclude that $\|\ell\| \le 1$.

Therefore $S_1$ is a subset of the unit ball of the Banach-space dual $A^*$. We give
$S_1$ the relative topology from the weak-star topology on $A^*$. Let us define the
function $p : S \to S_1$, and in the process we shall have proved that $S_1$ is not empty.
Every $s$ in $S$ defines an evaluation linear functional $e_s$ in $S_1$ by $e_s(f) = f(s)$, and
the function $p$ is defined by $p(s) = e_s$ for $s$ in $S$. To see that $S_1$ is a closed subset
of the unit ball of $A^*$ in the weak-star topology, let $\{\ell_\alpha\}$ be a net in $S_1$ converging
to some $\ell \in A^*$, the convergence being in the weak-star topology. Then we have
$\ell_\alpha(fg) = \ell_\alpha(f)\ell_\alpha(g)$ and $\ell_\alpha(\bar f) = \overline{\ell_\alpha(f)}$ for all $f$ and $g$ in $A$. Passing to the
limit, we obtain $\ell(fg) = \ell(f)\ell(g)$ and $\ell(\bar f) = \overline{\ell(f)}$. Also $\ell_\alpha(1) = 1$ for every $\alpha$,
so that $\ell(1) = 1$ and $\ell$ is nonzero. Hence $S_1$ is closed. By
Alaoglu’s Theorem (Theorem 4.14a), $S_1$ is compact. It is Hausdorff since $A^*$ is
Hausdorff in the weak-star topology.

Certainly we have $\sup_{s \in S} |e_s(f)| = \|f\|_{\sup}$. Since any $\ell$ in $S_1$ has $\|\ell\| \le 1$,
we obtain

$$\sup_{\ell \in S_1} |\ell(f)| = \|f\|_{\sup}. \qquad (*)$$

The definition of $U : A \to C(S_1)$ is $U(f)(\ell) = \ell(f)$, and this makes $U(f)(p(s))
= U(f)(e_s) = e_s(f) = f(s)$. The function $U(f)$ on $S_1$ is continuous by
definition of the weak-star topology. Because of the definition of $S_1$, $U$ is an
algebra homomorphism respecting complex conjugation and mapping 1 to 1.


¹² Checking that there are no other maximal ideals than the kernels of multiplicative linear
functionals requires proving that every complex “Banach field” is 1-dimensional, an early result in
the subject of Banach algebras and one that uses complex analysis in its proof. Details appear in
Section 10.
Also, $(*)$ shows that $U$ is an isometry. Since $A$ is Cauchy complete, so is $U(A)$.
Therefore $U(A)$ is a uniformly closed subalgebra of $C(S_1)$ stable under complex
conjugation and containing the constants. It separates points of $S_1$ by the definition
of equality of linear functionals. By the Stone–Weierstrass Theorem, $U(A) =
C(S_1)$. Since $U$ is an isometry, $U$ is one-one. Thus $U$ is an algebra isomorphism
of $A$ onto $C(S_1)$.

If $p(S)$ were not dense in $S_1$, then Urysohn’s Lemma would allow us to find
a nonzero continuous function $F$ on $S_1$ with values in $[0, 1]$ such that $F$ is 0
everywhere on $p(S)$. Since $U$ is onto $C(S_1)$, choose $f \in A$ with $U(f) = F$. If
$s$ is in $S$, then $0 = F(p(s)) = U(f)(p(s)) = f(s)$. Hence $\|f\|_{\sup} = 0$. By $(*)$,
$\ell(f) = 0$ for all $\ell \in S_1$. Then every $\ell \in S_1$ has $0 = \ell(f) = U(f)(\ell) = F(\ell)$,
and $F = 0$, contradiction. We conclude that $p(S)$ is dense.

To see that $U$ carries functions $\ge 0$ to functions $\ge 0$, we observe first that
the identity $\ell(\bar f) = \overline{\ell(f)}$ for $\ell \in S_1$ and the equality $\bar f = f$ for $f$ real together
imply that $\ell(f) = \ell(\bar f) = \overline{\ell(f)}$ for $f$ real. Hence $f$ real implies $\ell(f)$ real.
If $f \ge 0$, then $\bigl\|\,\|f\|_{\sup} - f\,\bigr\|_{\sup} \le \|f\|_{\sup}$. Since $\|\ell\| \le 1$, we therefore have

$$\ell(\|f\|_{\sup} - f) \le \bigl\|\,\|f\|_{\sup} - f\,\bigr\|_{\sup} \le \|f\|_{\sup}.$$

Since $\ell(1) = 1$, this says that
$\ell(f) \ge 0$. This inequality for all $\ell$ implies that $U(f) \ge 0$.

Finally suppose that $S$ is a Hausdorff topological space and that $A \subseteq C(S)$.
We are to show that $p : S \to S_1$ is continuous. If $s_\alpha \to s_0$ for a net in $S$, we
want $p(s_\alpha) \to p(s_0)$, i.e., $e_{s_\alpha} \to e_{s_0}$. According to the definition of the weak-star
topology, we are thus to show that $f(s_\alpha) \to f(s_0)$ for every $f$ in $A$. But this is
immediate from the continuity of $f$ on $S$.


We give three examples. A fourth example, concerning “almost periodic
functions,” will be considered in the problems at the end of Chapter VI. For
this fourth example the compact Hausdorff space of Theorem 4.15 admits the
structure of a compact group, and the representation theory of Chapter VI is
applicable to describe the structure of the space of almost periodic functions.

Problems 21–25 at the end of the chapter develop the theory of Theorem 4.15
further.


EXAMPLES. 


(1) $A = C(S)$ with $S$ compact Hausdorff. Then $p$ is a homeomorphism of
$S$ onto $S_1$. In fact, $p(S)$ is always dense in $S_1$. Here $p$ is continuous and $S$ is
compact. Thus $p(S)$ is closed and must equal $S_1$. The map $p$ is one-one because
Urysohn’s Lemma produces functions taking different values at two distinct points
$s$ and $s'$ of $S$ and thus exhibiting $e_s$ and $e_{s'}$ as distinct linear functionals. Since $p$
is continuous and one-one from a compact space onto a Hausdorff space, it is a
homeomorphism.

(2) One-point compactification. Let $S$ be a locally compact Hausdorff space,
and let $A$ be the subalgebra of $C(S)$ consisting of all continuous functions having
limits at infinity. For a function $f$, this condition means that there is some number
$c$ such that for each $\epsilon > 0$, some compact subset $K$ of $S$ has the property that
$|f(s) - c| < \epsilon$ for all $s$ not in $K$. Then $S_1$ may be identified with the one-point
compactification of $S$.


(3) Stone–Čech compactification. Let $S$ be a topological space, and let $A =
C(S)$. The resulting compact Hausdorff space $S_1$ is called the Stone–Čech
compactification of $S$. This space tends to be huge. For example, if $S =
[0, +\infty)$, the corresponding $S_1$ has cardinality greater than the cardinality of $\mathbb{R}$.


5. Linear Functionals and Convex Sets 


For this section and the next we discuss aspects of functional analysis that lead 
toward the theory of distributions and toward the use of fixed-point theorems. 
The topic is the role of convex sets in real and complex vector spaces, first
without any topology and then with an overlay of topology consistent with convex
sets. Sections 7-9 will then explore the consequences of this development, first 
in connection with smooth functions and then in connection with fixed-point 
theorems. 

Let $X$ be a real or complex vector space. A subset $E$ of $X$ is convex if for each
$x$ and $y$ in $E$, all points $(1 - t)x + ty$ are in $E$ for $0 \le t \le 1$.


Proposition 4.16. Convex sets in a real or complex vector space have the
following elementary properties:
(a) the arbitrary intersection of convex sets is convex,
(b) if $E$ is convex and $x_1, \dots, x_n$ are in $E$ and $t_1, \dots, t_n$ are nonnegative reals
with $t_1 + \cdots + t_n = 1$, then $t_1 x_1 + \cdots + t_n x_n$ is in $E$,
(c) if $E_1$ and $E_2$ are convex, then so are $E_1 + E_2$, $E_1 - E_2$, and $cE_1$ for any
scalar $c$,
(d) if $L : X \to Y$ is linear between two vector spaces with the same scalars
and if $E$ is a convex subset of $X$, then $L(E)$ is convex in $Y$,
(e) if $L : X \to Y$ is linear between two vector spaces with the same scalars
and if $E$ is a convex subset of $Y$, then $L^{-1}(E)$ is convex in $X$.


PROOF. Conclusions (a), (c), (d), and (e) are completely straightforward. For
(b), we induct on $n$, the case $n = 2$ being the definition of “convex.” Suppose that
the result is known for $n$ and that members $x_1, \dots, x_{n+1}$ of $E$ and nonnegative
reals $t_1, \dots, t_{n+1}$ with sum 1 are given. We may assume that $t_1 \ne 1$. Put
$s = t_2 + \cdots + t_{n+1}$ and $y = (1 - t_1)^{-1}(t_2 x_2 + \cdots + t_{n+1} x_{n+1})$. Since the reals
$(1 - t_1)^{-1} t_2, \dots, (1 - t_1)^{-1} t_{n+1}$ are nonnegative and have sum 1, the inductive
hypothesis shows that $y$ is in $E$. Since $t_1$ and $s$ are nonnegative and have sum 1,
$t_1 x_1 + s y = t_1 x_1 + \cdots + t_{n+1} x_{n+1}$ is in $E$. This completes the induction.
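
The inductive step is easy to follow on a small case. The sketch below carries it out for three points; the specific weights $\tfrac12, \tfrac13, \tfrac16$ are an assumption chosen here for illustration.

```latex
% Assumed data: x_1, x_2, x_3 in a convex set E, with weights
%   t_1 = 1/2,  t_2 = 1/3,  t_3 = 1/6   (nonnegative, sum 1).
\[
  s = t_2 + t_3 = \tfrac12, \qquad
  y = (1 - t_1)^{-1}(t_2 x_2 + t_3 x_3) = \tfrac23 x_2 + \tfrac13 x_3 .
\]
% y is a two-point convex combination, so y lies in E by definition.
% A second two-point combination then gives the three-point one:
\[
  t_1 x_1 + s\,y = \tfrac12 x_1 + \tfrac12\bigl(\tfrac23 x_2 + \tfrac13 x_3\bigr)
  = \tfrac12 x_1 + \tfrac13 x_2 + \tfrac16 x_3 \in E .
\]
```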
Let $E$ be a subset of our vector space $X$. We say that a point $p$ in $E$ is an
internal point of $E$ if for each $x$ in $X$, there is an $\epsilon > 0$ such that $p + \delta x$ is in
$E$ for all scalars¹³ $\delta$ with $|\delta| < \epsilon$. If $p$ in $X$ is neither an internal point of $E$ nor
an internal point of $E^c$, we say that $p$ is a bounding point of $E$. These notions
make no use of any topology on $X$.

Let K be a convex subset of X, and suppose that 0 is an internal point of K. 
For each x in X, let 


$$p(x) = \inf\{a > 0 \mid a^{-1}x \in K\}.$$


The function p(x) is called the support function of K. For an example let X be 
a normed linear space, and let K be the unit ball; then p(x) = ||x||. 
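
A one-dimensional example shows that the support function need not be symmetric when $K$ is not. This is a sketch under assumed data: the choice $X = \mathbb{R}$ and $K = [-1, 2]$ is made here for illustration and is not from the text.

```latex
% Assumed data: X = R, K = [-1, 2]; 0 is an internal point of K.
% For x > 0:  x/a \in K  iff  x/a <= 2   iff  a >= x/2.
% For x < 0:  x/a \in K  iff  x/a >= -1  iff  a >= -x.
\[
  p(x) = \inf\{a > 0 \mid a^{-1}x \in [-1,2]\} =
  \begin{cases}
    x/2, & x \ge 0, \\
    -x,  & x < 0 .
  \end{cases}
\]
% Then p(x) < 1 exactly for -1 < x < 2 (the internal points of K),
% and p(x) = 1 exactly at x = -1 and x = 2 (the bounding points).
```

Here $p(tx) = tp(x)$ holds for $t \ge 0$ but $p(-x) \ne p(x)$ in general, which is why only nonnegative scalars appear in the homogeneity condition.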

We are going to see that p(x) has some bearing on controlling the linear 
functionals on $X$, as a consequence of the Hahn–Banach Theorem. By the
“Hahn–Banach Theorem” here, we mean not the usual theorem for normed linear spaces¹⁴
but the more primitive statement¹⁵ from which that is derived:


HAHN–BANACH THEOREM. Let $X$ be a real vector space, and let $p$ be a real-valued
function on $X$ with

$$p(x + x') \le p(x) + p(x') \quad \text{and} \quad p(tx) = t\,p(x)$$

for all $x$ and $x'$ in $X$ and all real $t \ge 0$. If $f$ is a linear functional on a vector
subspace $Y$ of $X$ with $f(y) \le p(y)$ for all $y$ in $Y$, then there exists a linear
functional $F$ on $X$ with $F(y) = f(y)$ for all $y \in Y$ and $F(x) \le p(x)$ for all
$x \in X$.


Before discussing linear functionals in our present context, let us observe 
some properties of the support function p(x). Properties (b), (c), and (e) in the 
next lemma are the properties of the dominating function p in the Hahn—Banach 
Theorem as stated above. 


Lemma 4.17. Let $K$ be a convex subset of a vector space $X$, and suppose
that 0 is an internal point. Then the support function $p(x)$ of $K$ satisfies

(a) $p(x) \ge 0$,

(b) $p(x) < \infty$,

(c) $p(ax) = a\,p(x)$ for $a \ge 0$,

(d) $p(x) \le 1$ for all $x$ in $K$,

(e) $p(x + y) \le p(x) + p(y)$,

(f) $p(x) < 1$ if and only if $x$ is an internal point of $K$,

(g) $p(x) = 1$ characterizes the bounding points of $K$.


¹³ The scalars are complex numbers if $X$ is complex, real numbers if $X$ is real.
¹⁴ As in Theorem 12.13 of Basic.
¹⁵ As in Lemma 12.14 of Basic.




PROOF. Conclusions (a), (c), and (d) are immediate, and (b) follows since 0 is
an internal point of $K$.

For (e), let $c$ be arbitrary with $c > p(x) + p(y)$. We show that $c^{-1}(x + y)$
is in $K$. Since $c$ is arbitrary, it follows that the infimum of all numbers $d$ with
$d^{-1}(x + y)$ in $K$ is $\le p(x) + p(y)$; consequently $p(x + y)$ will have to be
$\le p(x) + p(y)$, and (e) will be proved. Thus write $c = a + b$ with $a > p(x)$ and
$b > p(y)$. Since $K$ is convex,

$$c^{-1}(x + y) = (a + b)^{-1}(x + y) = \frac{a}{a+b}\,a^{-1}x + \frac{b}{a+b}\,b^{-1}y$$

is in $K$, as required.

For (f), let $x$ be an internal point of $K$. Then $x + \epsilon x = (1 + \epsilon)x$ is in $K$ for
some $\epsilon > 0$, and hence $p(x) \le (1 + \epsilon)^{-1} < 1$.

Conversely suppose that $p(x) < 1$, and put $\epsilon = 1 - p(x)$. Fix $y$. Since 0 is
an internal point of $K$, we can find $\mu > 0$ such that $\delta y$ is in $K$ for $|\delta| \le \mu$. If $c$ is
any scalar of absolute value 1, then $c\mu y$ is in $K$, and hence $p(cy) \le \mu^{-1}$. If $\delta$ is
a scalar with $|\delta| < \epsilon\mu$, write $\delta = c'|\delta|$ with $|c'| = 1$. Then $p(\delta y) = |\delta|\,p(c'y) \le
|\delta|\mu^{-1} < \epsilon$. Applying (e) gives

$$p(x + \delta y) \le p(x) + p(\delta y) = (1 - \epsilon) + p(\delta y) < (1 - \epsilon) + \epsilon = 1.$$

By definition of $p$, $1^{-1}(x + \delta y)$ is in $K$, i.e., $x + \delta y$ is in $K$. Thus $x$ is an internal
point of $K$.

For (g), we can argue in the same way as with (f) to see that $p(x) > 1$
characterizes the internal points of $K^c$. Therefore $p(x) = 1$ characterizes the
bounding points of $K$.


We shall now apply the Hahn—Banach Theorem to prove the basic separation 
theorem. 


Theorem 4.18. Let $M$ and $N$ be disjoint nonempty convex subsets of a real
or complex vector space $X$, and suppose that $M$ has an internal point. Then there
exists a nonzero linear functional $F$ on $X$ such that for some real $c$, $\operatorname{Re} F \le c$
on $M$ and $\operatorname{Re} F \ge c$ on $N$.


PROOF. First suppose that X is real. If m is an internal point of M, then 0 is 
an internal point of M — m, and we can replace M and N by M —m and N —m. 
Changing notation, we may assume from the outset that 0 is an internal point of 
M. 

If $x_0$ is in $N$, then $-x_0$ is an internal point of $M - N$, and 0 is an internal
point of $K = M - N + x_0$. Since $M$ and $N$ are assumed disjoint, $M - N$
does not contain 0; thus $K$ does not contain $x_0$. Let $p$ be the support function
of $K$; this function satisfies the properties of the function $p$ in the Hahn–Banach
Theorem, according to Lemma 4.17. Moreover, $p(x_0) \ge 1$ by Lemma 4.17f.
Define $f(ax_0) = a\,p(x_0)$ for all (real) scalars $a$. Then $f$ is a nonzero linear
functional on the 1-dimensional space of real multiples of $x_0$, and it satisfies

$$a \ge 0 \quad \text{implies} \quad f(ax_0) = a\,p(x_0) = p(ax_0),$$

$$a < 0 \quad \text{implies} \quad f(ax_0) = a f(x_0) < 0 \le p(ax_0).$$

The Hahn–Banach Theorem shows that $f$ extends to a linear functional $F$ on
$X$ with $F(x) \le p(x)$ for all $x$. Then $F(x_0) \ge 1$, and Lemma 4.17 shows that
$p(K) \le 1$. Hence

$$F(x_0) \ge 1 \quad \text{and} \quad F(M - N + x_0) \le 1.$$

Thus we have $F(M - N + x_0) \le F(x_0)$, $F(M - N) \le 0$, $F(m - n) \le 0$ for all
$m$ in $M$ and $n$ in $N$, and $F(m) \le F(n)$ for all $m$ and $n$. Taking the supremum
over $m$ in $M$ and the infimum over $n$ in $N$ gives the conclusion of the theorem
for $X$ real.

Now suppose that the vector space $X$ is complex. We can initially regard $X$
as a real vector space by forgetting about complex scalars, and then the previous
case allows us to construct a real-linear $F$ such that $F(M) \le c \le F(N)$. Put
$G(x) = F(x) - iF(ix)$. Since $G(ix) = F(ix) - iF(i^2 x) = F(ix) - iF(-x) =
F(ix) + iF(x) = i(F(x) - iF(ix)) = iG(x)$, $G$ is complex linear. The real part
of $G$ equals $F$, and therefore $G$ satisfies the conclusion of the theorem.
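
The passage from a real-linear $F$ to a complex-linear $G$ can be checked on the simplest case. The following sketch takes $X = \mathbb{C}$ and $F = \operatorname{Re}$; this choice is an assumption made here for illustration.

```latex
% Assumed data: X = C regarded first as a real vector space,
%   F(z) = Re z  (real linear, not complex linear).
% Write z = a + ib with a, b real, so iz = -b + ia and Re(iz) = -b.
\[
  G(z) = F(z) - iF(iz) = \operatorname{Re} z - i \operatorname{Re}(iz)
       = a - i(-b) = a + ib = z .
\]
% G is the identity map, which is complex linear, and Re G = F as expected.
```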


6. Locally Convex Spaces 


In this section we shall apply the discussion of convex sets and linear functionals 
in the context of topological vector spaces. A topological vector space X is said 
to be locally convex if there is a base for its topology that consists of convex sets. 

Let us see that any topological vector space $X$ whose topology is given by a
family of seminorms $\|\cdot\|_s$ is locally convex. A base for the open sets consists
of all finite intersections of sets $U(y, s, r) = \{x \mid \|x - y\|_s < r\}$ with $y$ in $X$, $s$
equal to one of the seminorm indices, and $r > 0$. If $x$ and $x'$ are in $U(y, s, r)$
and if $0 \le t \le 1$, then

$$
\begin{aligned}
\|((1 - t)x + tx') - y\|_s &= \|(1 - t)(x - y) + t(x' - y)\|_s \\
&\le \|(1 - t)(x - y)\|_s + \|t(x' - y)\|_s \\
&= (1 - t)\|x - y\|_s + t\|x' - y\|_s \\
&< (1 - t)r + tr = r.
\end{aligned}
$$

Hence $(1 - t)x + tx'$ is in $U(y, s, r)$, and $U(y, s, r)$ is convex. Since the arbitrary
intersection of convex sets is convex by Proposition 4.16a, every member of the
base for the topology is convex. Thus $X$ is locally convex.

We are going to show that every locally convex topological vector space has 
many continuous linear functionals, enough to distinguish any two disjoint closed 
convex sets when one of them is compact. This result will in particular be 
applicable to the spaces $\mathcal{S}(\mathbb{R}^N)$ and $C^\infty(U)$ since their topologies are given by
seminorms. 

We begin with two lemmas that do not need an assumption of local convexity 
on the topological vector space. 


Lemma 4.19. In any topological vector space if $K_1$ and $K_2$ are closed sets
with $K_1$ compact, then the set $K_1 - K_2$ of differences is closed.


PROOF. It is simplest to use nets. Thus let $x$ be a limit point of $K_1 - K_2$, and
let $\{x_n\}$ be any net in $K_1 - K_2$ converging to $x$. Since each $x_n$ is in $K_1 - K_2$,
we can write it as $x_n = k_n^{(1)} - k_n^{(2)}$ with $k_n^{(1)}$ in $K_1$ and $k_n^{(2)}$ in $K_2$. Since $K_1$
is compact, $\{k_n^{(1)}\}$ has a convergent subnet, say $\{k_{n_j}^{(1)}\}$. Let $k^{(1)}$ be the limit of
$\{k_{n_j}^{(1)}\}$ in $K_1$. Both $\{x_{n_j}\}$ and $\{k_{n_j}^{(1)}\}$ are convergent, and $\{k_{n_j}^{(2)}\}$ must be convergent
because $k_{n_j}^{(2)} = k_{n_j}^{(1)} - x_{n_j}$, and subtraction is continuous. Let $k^{(2)}$ be its limit. This
limit has to be in $K_2$ since $K_2$ is closed, and then the equation $x = k^{(1)} - k^{(2)}$
exhibits $x$ as in $K_1 - K_2$. Hence $K_1 - K_2$ is closed.
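
The compactness hypothesis cannot simply be dropped. The sketch below gives a counterexample in $\mathbb{R}^2$ with two closed sets, neither compact; the particular sets are an assumption chosen here for illustration and are not from the text.

```latex
% Assumed data in R^2 (both sets closed, neither compact):
\[
  K_1 = \{(t, 1/t) \mid t > 0\}, \qquad
  K_2 = \{(t', 0) \mid t' \in \mathbb{R}\} .
\]
% Differences: (t - t', 1/t), where t - t' is arbitrary and 1/t > 0 is arbitrary.
\[
  K_1 - K_2 = \{(u, v) \in \mathbb{R}^2 \mid v > 0\} .
\]
% This open half plane is not closed: (0, 1/n) lies in it for each n,
% but the limit (0, 0) does not.
```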


Lemma 4.20. Let $X$ be any topological vector space, let $K_1$ and $K_2$ be
disjoint convex sets, and suppose that $K_1$ has nonempty interior. Then there
exists a nonzero continuous linear functional $F$ on $X$ with $\operatorname{Re} F(K_1) \le c$ and
$c \le \operatorname{Re} F(K_2)$ for some real number $c$.


PROOF. The key observation is that any interior point of a subset $E$ of $X$ is
internal. In fact, if $p$ is in $E^o$ and $x$ is in $X$, then $p + \delta x$ is in $E^o$ for $\delta = 0$. By
continuity of the vector-space operations and openness of $E^o$, $p + \delta x$ is in $E^o$ for
$|\delta|$ sufficiently small. Therefore $p$ is an internal point.

Since $K_1$ consequently has an internal point, Theorem 4.18 produces a nonzero
linear functional $F$ such that

$$\operatorname{Re} F(K_1) \le c \quad \text{and} \quad c \le \operatorname{Re} F(K_2) \qquad (*)$$

for some real number $c$. We complete the proof of the lemma by showing that $F$
is continuous. Let $f$ and $g$ be the real and imaginary parts of $F$. Then $g(x) =
-f(ix)$, and it is enough to show that $f$ is continuous. Fix an interior point $p$
of $K_1$, and choose an open neighborhood $U$ of 0 such that $p + U \subseteq K_1$. Then
$f(U) \subseteq f(K_1) - f(p)$ since $f$ is real linear, and $(*)$ shows that $f(U) \le c - f(p)$.
So $f(U) \le a$ for some $a > 0$. If $V = U \cap (-U)$, then

$$f(V) = f(U \cap (-U)) \subseteq f(U) \cap f(-U) = f(U) \cap (-f(U)) \subseteq [-a, a],$$

and therefore $f(\epsilon a^{-1}V) \subseteq [-\epsilon, \epsilon]$. In other words, $f$ is continuous at 0. Then
$f(x + \epsilon a^{-1}V) \subseteq f(x) + [-\epsilon, \epsilon]$, and $f$ is continuous everywhere.


Theorem 4.21. Let $X$ be a locally convex topological vector space, let $K_1$ and
$K_2$ be disjoint closed convex subsets of $X$, and suppose that $K_1$ is compact. Then
there exist $\epsilon > 0$, a real constant $c$, and a continuous linear functional $F$ on $X$
such that

$$\operatorname{Re} F(K_2) \le c - \epsilon \quad \text{and} \quad c \le \operatorname{Re} F(K_1).$$


PROOF. Lemma 4.19 shows that $K_1 - K_2$ is closed, and $K_1 - K_2$ does not
contain 0 because $K_1$ and $K_2$ are disjoint. Since $X$ is locally convex, we can
choose a convex open neighborhood $U$ of 0 disjoint from $K_1 - K_2$. Proposition
4.16c shows that $K_1 - K_2$ is convex, and Lemma 4.20 therefore applies to the
sets $U$ and $K_1 - K_2$ and yields a nonzero continuous linear functional $F$ such
that

$$\operatorname{Re} F(U) \le d \quad \text{and} \quad d \le \operatorname{Re} F(K_1 - K_2)$$

for some real $d$. Since $F$ is not zero, we can find $x_0$ in $X$ with $F(x_0) = 1$. Choose
$\epsilon > 0$ such that $|a| < \epsilon$ implies $ax_0$ is in $U$. Then

$$d \ge \operatorname{Re} F(U) \supseteq \operatorname{Re} F(\{ax_0 \mid |a| < \epsilon\}) = (-\epsilon, \epsilon),$$

and hence $d \ge \epsilon$. Therefore all $k_1$ in $K_1$ and $k_2$ in $K_2$ have

$$\operatorname{Re} F(k_1) - \operatorname{Re} F(k_2) = \operatorname{Re} F(k_1 - k_2) \ge d \ge \epsilon,$$

so that $\operatorname{Re} F(k_1) \ge \epsilon + \operatorname{Re} F(k_2)$. Taking $c = \inf_{k_1 \in K_1} \operatorname{Re} F(k_1)$ now yields the
conclusion of the theorem.


Corollary 4.22. Let $X$ be a locally convex topological vector space, let $K$ be
a closed convex subset of $X$, and let $p$ be a point of $X$ not in $K$. Then there exists
a continuous linear functional $F$ on $X$ such that

$$\sup_{k \in K} \operatorname{Re} F(k) < \operatorname{Re} F(p).$$


PROOF. This is the special case of Theorem 4.21 in which the given compact 
set is a singleton set. 
Corollary 4.23. If $X$ is a locally convex topological vector space and if $p$ and
$q$ are distinct points of $X$, then there exists a continuous linear functional $F$ on
$X$ such that $F(p) \ne F(q)$.


PROOF. This is the special case of Corollary 4.22 in which the given closed 
convex set is a singleton set. 


We conclude this section with a simple result about locally convex topological 
vector spaces that we shall need in the next section. 


Proposition 4.24. If X is a locally convex topological vector space and Y is a 
closed vector subspace, then the topological vector space X/Y is locally convex. 


REMARK. X/Y is a topological vector space by Proposition 4.4. 


PROOF. Let $E$ be an open neighborhood of a given point of $X/Y$. Without loss
of generality, we may take the given point to be the 0 coset. If $q : X \to X/Y$ is
the quotient map, $q^{-1}(E)$ is an open neighborhood of 0 in $X$. Since $X$ is locally
convex, there is a convex open neighborhood $U$ of 0 in $X$ with $U \subseteq q^{-1}(E)$. The
map $q$ carries open sets to open sets by Proposition 4.4 and carries convex sets to
convex sets by Proposition 4.16d, and thus $q(U)$ is an open convex neighborhood
of the 0 coset in $X/Y$ contained in $E$.


7. Topology on $C^\infty_{\mathrm{com}}(U)$


In this section we carry the discussion of local convexity in Sections 5–6 along the
path toward applications to smooth functions. Our objective will be to topologize
the space $C^\infty_{\mathrm{com}}(U)$ of smooth functions of compact support on the open set $U$
of $\mathbb{R}^N$. The members of $C^\infty_{\mathrm{com}}(U)$ extend to functions in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ by defining
them to be 0 outside $U$, and we often make this identification without special
comment.

The important thing about the topology will be what it accomplishes, rather 
than what the open sets are, and we shall therefore work toward a characterization 
of the topology, together with an existence proof. The characterization will be 
in terms of a universal mapping property, and local convexity will be part of that 
property. Ultimately it is possible to give an explicit description of the open 
sets, but we leave such a description for Problem 9 at the end of the chapter. 
The explicit description will show in particular that the topology is given by an 
uncountable family of seminorms that cannot be reduced to a countable family 
except when U is empty. 

Let us state the universal mapping property informally now, so that the ingredients 
become clear. Let K be any compact subset of the given open set U of $\mathbb{R}^N$, 
and define $C^\infty_K$ to be the vector space of all smooth functions of compact support 
on $\mathbb{R}^N$ with support contained in K. The space $C^\infty_K$ becomes a locally convex 
topological vector space when we impose the countable family of seminorms 
$\|f\|_\alpha = \sup_{x \in K} |D^\alpha f(x)|$, with $\alpha$ running over all differentiation multi-indices. 
Set-theoretically, $C^\infty_{\mathrm{com}}(U)$ is the union of all $C^\infty_K$ as K runs through the compact 
subsets of U. The topology on $C^\infty_{\mathrm{com}}(U)$ will be arranged so that 

(i) every inclusion $C^\infty_K \subseteq C^\infty_{\mathrm{com}}(U)$ is continuous, 
(ii) whenever a linear mapping $C^\infty_{\mathrm{com}}(U) \to X$ is given into a locally convex 
linear topological space X and the composition $C^\infty_K \to C^\infty_{\mathrm{com}}(U) \to X$ 
is continuous for every K, then the given mapping $C^\infty_{\mathrm{com}}(U) \to X$ is 
continuous. 

It will automatically have the additional property 

(iii) every inclusion $C^\infty_K \subseteq C^\infty_{\mathrm{com}}(U)$ is a homeomorphism with its image. 


We shall proceed somewhat abstractly, so as to be able to construct the topology 
of a locally convex topological vector space out of simpler data. If $(X, \mathcal{T})$ is a 
topological space and p is in X, we define a local neighborhood base for $\mathcal{T}$ at 
p to be a collection $\mathcal{N}_p$ of neighborhoods of p, not necessarily open, such that if 
V is any open set containing p, then there exists N in $\mathcal{N}_p$ with $N \subseteq V$. If X is a 
topological vector space with topology $\mathcal{T}$ and if $\mathcal{N}_0$ is a local neighborhood base 
at 0, then $\{p + N \mid N \in \mathcal{N}_0\}$ is a local neighborhood base at p because translation 
by p is a homeomorphism. A set is open if and only if it is a neighborhood of 
each of its points. Consequently we can recover $\mathcal{T}$ from a local neighborhood 
base $\mathcal{N}_0$ at 0 by this description: a subset V of X is open if and only if for each 
p in V, there exists $N_p$ in $\mathcal{N}_0$ such that $p + N_p \subseteq V$. 

Let us observe two properties of a local neighborhood base $\mathcal{N}_0$ at 0 for a 
topological vector space X. The first follows from the fact that X is Hausdorff, 
more particularly that each one-point subset of X is closed. The property is that 
for each $x \neq 0$ in X, there is some $M_x$ in $\mathcal{N}_0$ with x not in $M_x$. 

The second follows from the fact that 0 is an interior point of each member N 
of $\mathcal{N}_0$. The property is that 0 is an internal point of N in the sense of Section 5. 
The fact that interior implies internal was proved in the first paragraph of the 
proof of Lemma 4.20. 


We shall show in Lemma 4.25 that we can arrange in the locally convex case 
for each member N of a local neighborhood base $\mathcal{N}_0$ at 0 to have the additional 
property of being circled in the sense that $zN \subseteq N$ for all scalars z with $|z| \le 1$. 

Then we shall see in Proposition 4.26 that we can formulate a tidy necessary 
and sufficient condition for a system of sets containing 0 in a real or complex 
vector space X to be a local neighborhood base for a topology on X that makes 
X into a locally convex topological vector space. 


Lemma 4.25. Any locally convex topological vector space has a local 
neighborhood base at 0 consisting of convex circled sets. 


PROOF. It is enough to show that if V is an open neighborhood of 0, then 
there is an open subneighborhood U of 0 that is convex and circled. Since the 
underlying topological vector space is locally convex, we may assume that V 
is convex. Replacing V by $V \cap (-V)$, we may assume by parts (a) and (c) of 
Proposition 4.16 that V is stable under multiplication by −1. Since V is convex, 
it follows that $cV \subseteq V$ for any real c with $|c| \le 1$. If the field of scalars is $\mathbb{R}$, 
then the proof of the lemma is complete at this point. 

Thus suppose that the field of scalars is $\mathbb{C}$. If V is a convex open neighborhood 
of 0, put 

$$W = \{u \in V \mid zu \in V \text{ for all } z \in \mathbb{C} \text{ with } |z| \le 1\}.$$

Then W is convex by Proposition 4.16a, and it is circled. Let us show that 
$W \supseteq \tfrac12 V \cap \tfrac12 iV$. Thus let u be an element of $\tfrac12 V \cap \tfrac12 iV$, and write it as 
$u = \tfrac12 v_1 = \tfrac12 i v_2$ with $v_1$ and $v_2$ in V. Let $z \in \mathbb{C}$ be given with $|z| \le 1$, and let 
x and y be the real and imaginary parts of z. The vectors $\pm v_1$ and 0 are in V, 
and V is convex; since $|x| \le 1$, $xv_1$ is in V. Similarly $-yv_2$ is in V. We can 
write $zu = \tfrac12 (x + iy)v_1 = \tfrac12 (xv_1) + \tfrac12 (-yv_2)$, and this is in V since V is convex. 
Therefore zu is in V, and u is in W. Hence $W \supseteq \tfrac12 V \cap \tfrac12 iV$, as asserted. 

Let U be the interior $W^o$ of W. Then U is an open neighborhood of 0, and 
we show that it is convex and circled; this will complete the proof. Let u and v 
be in U. Since U is open, we can find an open neighborhood N of 0 such that 
$u + N \subseteq U$ and $v + N \subseteq U$. If n is in N and if t satisfies $0 \le t \le 1$, then 
$(1-t)u + tv + n = (1-t)(u + n) + t(v + n)$ exhibits $(1-t)u + tv + n$ as a 
convex combination of a member of $u + N \subseteq W$ and a member of $v + N \subseteq W$, 
hence as a member of W. Therefore every member of $(1-t)u + tv + N$ lies in 
W. Since this set is an open neighborhood of $(1-t)u + tv$, the point $(1-t)u + tv$ 
lies in $W^o = U$, and U is convex. 

To see that U is circled, let u and N be as in the previous paragraph with 
$u + N \subseteq U$. If $|z| \le 1$, then $u + N \subseteq W$ implies $z(u + N) \subseteq W$ since 
W is circled. Hence $zu + zN \subseteq W$. For $z \neq 0$ the set zN is open, so that $zu + zN$ is an open 
neighborhood of zu contained in W, and we must have $zu + zN \subseteq W^o = U$; 
for $z = 0$ the point $zu = 0$ is already in U. 
Therefore U is circled. 


Proposition 4.26. Let X be a real or complex vector space. If X has a 
topology making it into a locally convex topological vector space, then X has a 
local neighborhood base $\mathcal{N}_0$ at 0 for that topology such that 

(a) each N in $\mathcal{N}_0$ is convex and circled with 0 as an internal point, 

(b) whenever M and N are in $\mathcal{N}_0$, there is some P in $\mathcal{N}_0$ with $P \subseteq M \cap N$, 

(c) whenever N is in $\mathcal{N}_0$ and a is a nonzero scalar, then aN is in $\mathcal{N}_0$, 

(d) each $x \neq 0$ in X has some associated $M_x$ in $\mathcal{N}_0$ such that x is not in $M_x$. 

Conversely if $\mathcal{N}_0$ is any family of subsets of the vector space X such that (a), (b), 
(c), and (d) hold, then there exists one and only one topology on X making X 
into a locally convex topological vector space in such a way that $\mathcal{N}_0$ is a local 
neighborhood base at 0. 


PROOF. For the direct part of the proof, Lemma 4.25 shows that there is some 
local neighborhood base at 0 consisting of convex circled sets. To such a local 
neighborhood base we are free to add any additional neighborhoods of 0. Thus 
we may take $\mathcal{N}_0$ to consist of all convex circled neighborhoods of 0. Then (b) 
and (c) hold, and (d) holds since the topology is Hausdorff. Since 0 is an internal 
point of any neighborhood of 0, (a) holds. This proves existence. 

For the converse there is only one possibility for the topology $\mathcal{T}$: V is open 
if for each x in V, there is some $N_x$ in $\mathcal{N}_0$ with $x + N_x \subseteq V$. This proves the 
uniqueness of $\mathcal{T}$, and we are to prove existence. For existence we define open sets 
in this way and define $\mathcal{T}$ to be the collection of all open sets. The definition makes 
$\varnothing$ open and the arbitrary union of open sets open, and (b) makes the intersection 
of two open sets open. 

We shall show that the complement of any $\{x_0\}$ is open. Then it follows by 
taking unions that X is open, so that $\mathcal{T}$ is a topology; also we will have proved 
that every one-point set is closed. If $x_1 \neq x_0$, we use (d) to choose $M_{x_0 - x_1}$ in $\mathcal{N}_0$ 
with $x_0 - x_1$ not in $M_{x_0 - x_1}$. Then $x_1 + M_{x_0 - x_1} \subseteq X - \{x_0\}$. Since $x_1$ is arbitrary, 
$X - \{x_0\}$ is open. 

With $\mathcal{T}$ established as a topology, let us see that every member of $\mathcal{N}_0$ is a 
neighborhood of 0. This step involves considering the family of sets aN for 
fixed N in $\mathcal{N}_0$ and for arbitrary positive a. If $0 \le t \le 1$ and if $n_1$ and $n_2$ 
are in N, then $(1-t)n_1 + tn_2$ is in N since (a) says that N is convex. Hence 
$(1-t)N + tN \subseteq N$. If $a > 0$ and $b > 0$, then we can take $t = b(a+b)^{-1}$ and 
obtain $a(a+b)^{-1}N + b(a+b)^{-1}N \subseteq N$. Multiplying by $a + b$ gives 

$$aN + bN \subseteq (a+b)N \quad \text{for all positive } a \text{ and } b. \qquad (*)$$

In particular the sets aN are nested for $a > 0$, i.e., $0 < a < a'$ implies $aN \subseteq a'N$. 

From these facts we can show that each N in $\mathcal{N}_0$ is a neighborhood of 0. Given 
N, define $U = \bigcup_{0 < a < 1} aN$. This is a subset of N by the nesting property, and 
we shall prove that it is open. If x is in U, then x is in aN for some a with 
$0 < a < 1$, and $(*)$ shows that $x + \tfrac12 (1-a)N \subseteq \tfrac12 (1+a)N \subseteq U$. By (c), $\tfrac12 (1-a)N$ is in $\mathcal{N}_0$, 
and therefore $\tfrac12 (1-a)N$ can serve as a member $N_x$ of $\mathcal{N}_0$ such that $x + N_x \subseteq U$. 
We conclude that U is open. Therefore N is a neighborhood of 0. 

Next let us see that translations are homeomorphisms. If V is open and if $x_0$ 
is given, we know that each x in V has an associated $N_x$ such that $x + N_x \subseteq V$. 
If y is in $x_0 + V$, then $x = y - x_0$ is in V and we see that $(y - x_0) + N_{y - x_0} \subseteq V$ 
and $y + N_{y - x_0} \subseteq x_0 + V$. Hence $x_0 + V$ is open, and every translation is a 
homeomorphism. 

Let us see that addition is continuous at (0, 0), and then the fact that translations 
are homeomorphisms implies that addition is continuous everywhere. If V is an 
open neighborhood of 0, then the definition of open set says that there is some 
N in $\mathcal{N}_0$ with $0 + N \subseteq V$. By (c), $\tfrac12 N$ is in $\mathcal{N}_0$. It is enough to prove that 
$(\tfrac12 N, \tfrac12 N)$ maps into V under addition. But this is immediate from $(*)$ since 
$\tfrac12 N + \tfrac12 N \subseteq N \subseteq V$. 

Next we investigate continuity of the mapping $x \mapsto ax$ for $a \neq 0$. It is enough 
to show that if V is open, then so is $a^{-1}V$. Since V is open, every x in V has an 
associated $N_x$ in $\mathcal{N}_0$ such that $x + N_x \subseteq V$. The most general element of $a^{-1}V$ 
is of the form $a^{-1}x$ with x in V, and we have $a^{-1}x + a^{-1}N_x \subseteq a^{-1}V$. Since (c) 
shows $a^{-1}N_x$ to be in $\mathcal{N}_0$, we conclude that $a^{-1}V$ is open. 

Let us see that scalar multiplication is continuous at (1, x), and then the fact that 
$x \mapsto ax$ is continuous for $a \neq 0$ implies that scalar multiplication is continuous 
everywhere except possibly at (0, x). Let V be an open neighborhood of x, and 
choose N in $\mathcal{N}_0$ with $x + N \subseteq V$. Since N is in $\mathcal{N}_0$, (c) shows that $\tfrac12 N$ is in 
$\mathcal{N}_0$. Then 0 is an internal point of $\tfrac12 N$ by (a), and there exists $\epsilon > 0$ such that 
$-\epsilon \le c \le \epsilon$ implies that cx is in $\tfrac12 N$. There is no loss of generality in taking 
$\epsilon \le 1$. Since $\tfrac12 N$ is circled by (a), cx is in $\tfrac12 N$ for $|c| \le \epsilon$. Let A be the set of 
scalars with $|a - 1| < \epsilon$. We show that scalar multiplication carries $A \times (x + \tfrac14 N)$ 
into V. In fact, if a is in A and $\tfrac14 n$ is in $\tfrac14 N$, then $|a| < 2$, $\tfrac14 an$ is in $\tfrac12 N$, and 
$(*)$ gives 

$$a(x + \tfrac14 n) = (a - 1)x + (x + \tfrac14 an) \in \tfrac12 N + (x + \tfrac12 N) \subseteq x + N \subseteq V.$$


To complete the proof of continuity of scalar multiplication, we show 
continuity at all points (0, x). Let V be an open neighborhood of 0 in X, and choose 
N in $\mathcal{N}_0$ with $0 + N \subseteq V$. Since 0 is an internal point of N, there is some $\epsilon > 0$ 
such that cx is in N for real c with $|c| \le \epsilon$. For this $\epsilon$, $\tfrac12 \epsilon x$ is in $\tfrac12 N$. If $|z| \le 1$ 
and y is in $\tfrac12 N$, then $(z, \tfrac12 \epsilon x + y)$ maps to $\tfrac12 z \epsilon x + zy$, which lies in $\tfrac12 N + \tfrac12 N$ 
since N is circled. In turn, this is contained in N by $(*)$ and therefore is contained 
in V. So $(\tfrac12 \epsilon z, x + 2\epsilon^{-1} y)$ maps into V if $|z| \le 1$ and y is in $\tfrac12 N$. Altering the 
definitions of z and y, we conclude that $(z, x + y)$ maps into V if $|z| \le \tfrac12 \epsilon$ and y 
is in $\epsilon^{-1} N$. This proves the continuity. 

Since {0} is a closed set, Lemma 4.3 is applicable and shows that X is Hausdorff, 
hence is a topological vector space. Inside any open neighborhood V of 0 lies 
some set N in $\mathcal{N}_0$, and $\bigcup_{0 < a < 1} aN$ is a convex open subneighborhood of V. 
Therefore the topology is locally convex. 

We are almost in a position to topologize $C^\infty_{\mathrm{com}}(U)$. If $i_K$ denotes the inclusion 
of $C^\infty_K$ into $C^\infty_{\mathrm{com}}(U)$, we shall define a convex circled subset N in $C^\infty_{\mathrm{com}}(U)$ 
having 0 as an internal point to be in a local neighborhood base at 0 if $i_K^{-1}(N)$ is 
a neighborhood of 0 in $C^\infty_K$ for every compact subset K of U. Then conditions 
(a), (b), and (c) in Proposition 4.26 will be met, and an examination of the 
proof of that proposition shows that we obtain a topology for $C^\infty_{\mathrm{com}}(U)$ in which 
addition and scalar multiplication are continuous. What is lacking is the Hausdorff 
property, which follows once (d) holds in Proposition 4.26. Verifying (d) requires 
a construction, whose main step is given in the following lemma. 


Lemma 4.27. Let X be a locally convex topological vector space, let Y be a 
closed vector subspace, and let Y be given the relative topology, which is locally 
convex. If N is a convex circled neighborhood of 0 in Y and $x_0$ is a point in X 
not in N, then there exists a convex circled neighborhood M of 0 in X such that 
$M \cap Y = N$ and such that $x_0$ is not in M. 


FIGURE 4.1. Extension of convex circled neighborhood of 0. 
The lemma extends N to the set given in the figure 
by $M_3 = R_1 \cup M_2 \cup R_2$. 


PROOF. Since N is a neighborhood of 0 in Y and since Y has the relative 
topology, there exists a neighborhood $M_1$ of 0 in X such that $M_1 \cap Y = N$. We 
shall adjust $M_1$ to make it convex circled and to arrange that $x_0$ is not in it. Since 
X is locally convex, we can find a convex circled neighborhood $M_2$ of 0 contained 
in $M_1$. Taking a cue from Figure 4.1, define 

$$M_3 = \{(1-t)n + tm_2 \mid n \in N,\ m_2 \in M_2,\ 0 \le t \le 1\}.$$

This is a neighborhood of 0 since it contains $M_2$, and it is convex circled since N 
and $M_2$ are convex circled. 
We shall prove that 

$$M_3 \cap Y = N.$$

Certainly $M_3 \cap Y \supseteq N$. For the reverse inclusion let $m_3$ be in $M_3 \cap Y$, and write 
$m_3 = (1-t)n + tm_2$ with $n \in N$, $m_2 \in M_2$, and $0 \le t \le 1$. If $t = 0$, then 
$m_3 = n$ is already in N. If $t > 0$, then $m_2 = t^{-1}(m_3 - (1-t)n)$ exhibits $m_2$ as a 
linear combination of members of Y, hence as a member of Y. Since $M_2 \subseteq M_1$, 
$m_2$ is in $M_1 \cap Y = N$. Therefore $m_3$ is a convex combination of the members n 
and $m_2$ of N and must lie in N since N is convex. Consequently $M_3 \cap Y = N$. 
If $x_0$ lies in Y, then we can take $M = M_3$ since $x_0$ is by assumption not in 
$N = M_3 \cap Y$ and cannot therefore be in the larger set $M_3$. If $x_0$ is not in Y, then Proposition 
4.24 says that X/Y is a locally convex topological vector space. Since $x_0 + Y$ is 
not the 0 coset, we can find a convex circled neighborhood P of the 0 coset that 
does not contain $x_0 + Y$. If $q : X \to X/Y$ is the quotient map, then $q^{-1}(P)$ by 
Proposition 4.16e is a convex circled neighborhood of 0 in X that does not contain 
$x_0$ and satisfies $q^{-1}(P) \cap Y = Y$. Therefore $M = M_3 \cap q^{-1}(P)$ is a convex 
circled neighborhood of 0 in X that does not contain $x_0$ and satisfies $M \cap Y = N$. 


Proposition 4.28. Let X be a real or complex vector space, and suppose that X 
is the increasing union $X = \bigcup_{p=1}^{\infty} X_p$ of a sequence of locally convex topological 
vector spaces such that for each p, $X_p$ is a closed vector subspace of $X_{p+1}$ and 
has the relative topology. Then there exists a unique topology on X making it 
into a locally convex topological vector space in such a way that 

(a) each inclusion $i_p : X_p \to X$ is continuous, 

(b) whenever $L : X \to Y$ is a linear function from X into a locally convex 
topological vector space Y such that $L \circ i_p : X_p \to Y$ is continuous for 
all p, then L is continuous. 

This unique topology has the property that 

(c) each inclusion $i_p : X_p \to X$ is a homeomorphism with its image. 


PROOF. Let $\mathcal{N}_0$ be the family of all convex circled subsets N of X having 0 
as an internal point such that $i_p^{-1}(N)$ is a neighborhood of 0 in $X_p$ for all p. We 
shall prove that $\mathcal{N}_0$ satisfies the four conditions (a) through (d) of Proposition 
4.26, so that X has a unique topology making it into a locally convex topological 
vector space in such a way that $\mathcal{N}_0$ is a local neighborhood base at 0. Condition 
(a) holds by definition. Condition (b) holds because the intersection of two 
convex circled subsets with 0 as an internal point is again a convex circled set 
with 0 as an internal point and because the intersection of two neighborhoods is 
a neighborhood. Condition (c) holds because multiplication by a nonzero scalar 
sends convex circled sets with 0 as an internal point into convex circled sets 
with 0 as an internal point and because multiplication by a nonzero scalar sends 
neighborhoods of 0 to neighborhoods of 0. 

We have to prove (d) in Proposition 4.26, namely that each $x_0 \neq 0$ in X has 
some associated M in $\mathcal{N}_0$ such that $x_0$ is not in M. Since $X = \bigcup_{p=1}^{\infty} X_p$, choose 
$p_0$ as small as possible so that $x_0$ is in $X_{p_0}$. Since $X_{p_0}$ satisfies (a) through (d) and 
since $x_0 \neq 0$, we can find some convex circled neighborhood $M_{p_0}$ of 0 in $X_{p_0}$ that 
does not contain $x_0$. Proceeding inductively by means of Lemma 4.27, we can 
find, for each $p > p_0$, a convex circled neighborhood $M_p$ of 0 in $X_p$ that does not 
contain $x_0$ such that $M_p \cap X_{p-1} = M_{p-1}$. Define $M = \bigcup_{p \ge p_0} M_p$. Then M is 
convex circled since each $M_p$ has this property. To see that 0 is an internal point 
of M, we argue as follows: for each x in X, x lies in some $X_p$, the set $M_p$ has 0 
as an internal point since $M_p$ is a neighborhood of 0, $M_p$ contains all cx for c 
real and small, and the larger set M contains all cx for c real and small. For each 
$p \ge p_0$, the set $i_p^{-1}(M)$ equals $M_p$, which was constructed as a neighborhood 
of 0 in $X_p$. The intersection $i_k^{-1}(M) = M_p \cap X_k$ has to be a neighborhood of 0 in 
$X_k$ for $k < p$ since $M_p$ is a neighborhood of 0 in $X_p$, and the set M is therefore 
in $\mathcal{N}_0$. Thus M meets the requirement of being a member of $\mathcal{N}_0$ that does not 
contain $x_0$, and (d) holds in Proposition 4.26. 

We are left with proving (a) through (c) in the present proposition and with 
proving that no other topology meets these conditions. For (a), since $i_p$ is linear, 
it is enough to prove continuity at 0. Hence we are to see that if N is in $\mathcal{N}_0$, 
then $i_p^{-1}(N)$ is a neighborhood of 0 in $X_p$. But this is just one of the defining 
conditions for the set N to be in $\mathcal{N}_0$. 

For (b), since L is linear, it is enough to prove continuity at 0. Since Y is locally 
convex, the convex circled neighborhoods of 0 in Y form a local neighborhood 
base. If E is such a neighborhood, we are to show that $N = L^{-1}(E)$ is a 
neighborhood of 0 in X. The set E is convex and circled with 0 as an internal 
point, and hence the same thing is true of N. Also, $i_p^{-1}(N) = i_p^{-1}(L^{-1}(E)) = 
(L \circ i_p)^{-1}(E)$ is a neighborhood of 0 in $X_p$ since $L \circ i_p$ is by assumption continuous. 
Therefore $N = L^{-1}(E)$ is in $\mathcal{N}_0$, and then $L^{-1}(E)$ is a neighborhood of 0 in the 
topology imposed on X. Hence L is continuous at 0 and is continuous. 

For (c), we again use Lemma 4.27, except that this time we do not need a 
point $x_0$. We are to show that if $N_{p_0}$ is a neighborhood of 0 in $X_{p_0}$, then $i_{p_0}(N_{p_0})$ 
is a neighborhood of 0 in the relative topology that X defines on $X_{p_0}$. Since $X_{p_0}$ 
is locally convex, there is no loss of generality in assuming that $N_{p_0}$ is convex 
circled. Proceeding inductively for $p > p_0$, we use the lemma to construct a 
convex circled neighborhood $N_p$ of 0 in $X_p$ such that $N_p \cap X_{p-1} = N_{p-1}$. Put 
$N = \bigcup_{p \ge p_0} N_p$. Arguing in the same way as earlier in the proof, we see that N 
is in $\mathcal{N}_0$. Then $i_{p_0}(N_{p_0}) = X_{p_0} \cap N$, and $i_{p_0}(N_{p_0})$ is exhibited as the intersection of 
$X_{p_0}$ with a neighborhood of 0 in X. This proves (c). 

Finally suppose that the constructed topology on X is $\mathcal{T}$ and that $\mathcal{T}'$ is a second 
topology making X into a locally convex topological vector space in such a way 
that (a) and (b) hold. Let $1_{\mathcal{T}}$ be the identity map from $(X, \mathcal{T})$ to $(X, \mathcal{T}')$. By 
(a) for $\mathcal{T}'$, the composition $1_{\mathcal{T}} \circ i_p : X_p \to (X, \mathcal{T}')$ is continuous. By (b) for $\mathcal{T}$, $1_{\mathcal{T}}$ 
is continuous from $(X, \mathcal{T})$ to $(X, \mathcal{T}')$. Reversing the roles of $\mathcal{T}$ and $\mathcal{T}'$, we see 
that the identity map is continuous from $(X, \mathcal{T}')$ to $(X, \mathcal{T})$. Therefore $1_{\mathcal{T}}$ is a 
homeomorphism. 

In the terminology of abstract functional analysis, one says that X in 
Proposition 4.28 is a strict inductive limit¹⁶ of the spaces $X_p$. With extra hypotheses that 
are satisfied in our case of interest, one says that X acquires the LF topology¹⁷ 
from the $X_p$'s. 

Now let us apply the abstract theory to $C^\infty_{\mathrm{com}}(U)$. If $\{K_p\}$ is any exhausting 
sequence of compact subsets of U, then we apply Proposition 4.28 with $X = 
C^\infty_{\mathrm{com}}(U)$ and $X_p = C^\infty_{K_p}$. For the inclusion $X_p \subseteq X_{p+1}$, the restriction to $C^\infty_{K_p}$ 
of the seminorms on $C^\infty_{K_{p+1}}$ yields the seminorms for $C^\infty_{K_p}$, and therefore $X_p$ has 
the relative topology as a vector subspace of $X_{p+1}$. The space $X_p$ is a closed 
subspace because $C^\infty_{K_p}$ is Cauchy complete and because complete subsets of a 
metric space are closed. Thus the hypotheses are satisfied, and $C^\infty_{\mathrm{com}}(U)$ acquires 
a unique topology as a locally convex topological vector space such that 

(i) each inclusion $C^\infty_{K_p} \subseteq C^\infty_{\mathrm{com}}(U)$ is continuous, 

(ii) whenever a linear mapping $C^\infty_{\mathrm{com}}(U) \to X$ is given into a locally convex 
linear topological space X and the composition $C^\infty_{K_p} \to C^\infty_{\mathrm{com}}(U) \to X$ 
is continuous for every p, then the given mapping $C^\infty_{\mathrm{com}}(U) \to X$ is 
continuous. 

Furthermore 

(iii) each inclusion $C^\infty_{K_p} \subseteq C^\infty_{\mathrm{com}}(U)$ is a homeomorphism with its image. 

To complete our construction, all we have to do is show that the resulting topology 
on $C^\infty_{\mathrm{com}}(U)$ does not depend on the choice of exhausting sequence. 

Proposition 4.29. The inductive limit topology on $C^\infty_{\mathrm{com}}(U)$ is independent of 
the choice of exhausting sequence. Consequently 

(a) each inclusion $C^\infty_K \subseteq C^\infty_{\mathrm{com}}(U)$ is a homeomorphism with its image, 

(b) whenever a linear mapping $C^\infty_{\mathrm{com}}(U) \to X$ is given into a locally convex 
linear topological space X and the composition $C^\infty_K \to C^\infty_{\mathrm{com}}(U) \to X$ 
is continuous for every compact subset K of U, then the given mapping 
$C^\infty_{\mathrm{com}}(U) \to X$ is continuous. 


¹⁶The words “direct limit” mean the same thing as “inductive limit,” but “inductive” is more 
common in this situation. The term “strict” refers to the fact that the successive inclusions 
$i_{p+1,p} : X_p \to X_{p+1}$ are one-one with $i_{p+1,p}(X_p)$ homeomorphic to $X_p$. The notion of 
“direct limit” is a construction in category theory that is useful within several different categories. 
Uniqueness of the direct limit up to canonical isomorphism is a formality built into the definition; 
existence depends on the particular category. For this situation the construction is taking place within 
the category of locally convex topological vector spaces (and continuous linear maps). A direct-limit 
construction within a different category plays a role in Problems 26–30 at the end of the chapter, 
and those problems are continued at the end of Chapter VI. 

¹⁷“LF” refers to “Fréchet limit.” In the usual situation the spaces $X_p$ are assumed to be locally 
convex complete metric topological vector spaces, i.e., “Fréchet spaces.” The $X_p$'s have this property 
in the application to $C^\infty_{\mathrm{com}}(U)$. 

PROOF. Write X for $C^\infty_{\mathrm{com}}(U)$ with its topology defined relative to an exhausting 
sequence $\{K_p\}$ of compact subsets of U, and write Y for $C^\infty_{\mathrm{com}}(U)$ with its topology 
defined relative to an exhausting sequence $\{K'_p\}$. If $K_k$ is a member of the sequence 
$\{K_p\}$, then $K_k \subseteq K'_p$ for $p \ge$ some index $p_0$ depending on k since the interiors 
of the sets $K'_p$ cover the compact set $K_k$. The inclusion $C^\infty_{K_k} \subseteq C^\infty_{K'_p}$ is continuous 
for $p \ge p_0$, and therefore the composition $C^\infty_{K_k} \to C^\infty_{K'_p} \to Y$ is continuous. This 
continuity for all k implies that the identity map from X into Y is continuous. 
Reversing the roles of X and Y, we see that the identity map is a homeomorphism. 


8. Krein–Milman Theorem 


In this section we carry the discussion of local convexity in Sections 5–6 along the 
path toward fixed-point theorems. Our objective will be to prove a fundamental 
existence theorem about “extreme points.” 

If K is a convex set in a real or complex vector space and if $x_0$ is in K, we say 
that $x_0$ is an extreme point of K if $x_0$ is not in the interior of any line segment 
belonging to K, i.e., if 

$$x_0 = (1-t)x + ty \ \text{ with } 0 < t < 1 \text{ and } x, y \in K \quad \text{implies} \quad x_0 = x = y.$$

Let X be a topological vector space, and let K be a closed convex subset of 
X. A nonempty closed convex subset S of K is called a face if whenever $\ell$ is a 
line segment belonging to K, in the above sense, and $\ell$ has an interior point in S, 
then the whole line segment belongs to S. With this definition, $x_0$ is an extreme 
point of K if and only if the singleton set $\{x_0\}$ is a face. 

If E is a subset of X, then the closed convex hull of E is defined to be the 
intersection of all closed convex subsets of X that contain E. It may be described 
explicitly as the closure of the set of all convex combinations of members of E. 


Theorem 4.30 (Krein–Milman Theorem). If K is a compact convex set in a 
locally convex topological vector space, then K is the closed convex hull of the 
set of extreme points of K. In particular, if K is nonempty, then K has an extreme 
point. 


PROOF. Let X be the underlying topological vector space. We may assume, 
without loss of generality, that K is nonempty. Let us see that if f is any 
continuous linear functional on X, then the subset of K on which Re f assumes its 
maximum value is a face. In fact, let S be the subset where g = Re f assumes its 
maximum value m. Then S is nonempty since K is compact and g is continuous, 
and the continuity and real linearity of g imply that S is closed and convex. To 
check that S is a face, let $x_0$ be in S, and suppose that $x_0 = (1-t)x + ty$ with 
$0 < t < 1$ and x, y in K. Then 

$$m = g(x_0) = (1-t)g(x) + tg(y) \le m(1-t) + tm = m.$$

Equality must hold throughout, and therefore $g(x) = m = g(y)$. Hence x and y 
are in S, and S is a face. 

Next let us see that any face of K contains an extreme point. In fact, order the 
faces by inclusion downward. The intersection of a chain of faces is nonempty 
by compactness and hence is a face that provides a lower bound for the chain. By 
Zorn’s Lemma there exists a minimal face $S_1$. Arguing by contradiction, suppose 
that $S_1$ contains at least two points. Then Corollary 4.23 and the local convexity 
of X yield a continuous linear functional whose real part takes distinct values at 
the two points. From the previous paragraph we find that $S_1$ contains a proper 
face S. A face of a face is a face. Thus S is a face of K strictly smaller than the 
minimal face $S_1$, and we arrive at a contradiction. 

Now we can complete the proof. If E denotes the closed convex hull of the 
set of extreme points of K, then certainly $E \subseteq K$. Arguing by contradiction, 
suppose that equality fails: Let $x_0$ be in K but not in E. Then Corollary 4.22 and 
the local convexity of X produce a continuous linear functional whose real part 
has supremum on E strictly less than the value at $x_0$. The first paragraph of the 
proof shows that the subset of K where the real part of this linear functional takes 
its maximum value on K is a face of K, and the second paragraph shows that this face has 
an extreme point, which is an extreme point of K because a face of a face is a face. 
Since the maximum value is at least the value at $x_0$, this extreme point is not in E, 
and we arrive at a contradiction. 


Compact convex subsets of $\mathbb{R}^N$ arise in practical maximum-minimum 
problems involving several variables, typically economic variables. Often the compact 
convex set is a polyhedron, and the function to be maximized is the sum of a 
constant and a linear function. The Krein–Milman Theorem produces extreme 
points, and the basic techniques of the subject of linear programming show that 
the maximum is attained at an extreme point and show how to find this extreme 
point. 
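
As a small numerical illustration of the preceding remark, the following Python sketch takes a convex quadrilateral in the plane and an objective equal to a constant plus a linear function, and compares the largest value at a vertex with the values at randomly sampled convex combinations of the vertices. The quadrilateral, the coefficients, and the function names are all hypothetical choices made for illustration; the sketch only checks the statement on samples and is not a linear programming method.

```python
import random

# Hypothetical data: vertices (extreme points) of a convex quadrilateral in R^2.
VERTICES = [(0.0, 0.0), (4.0, 0.0), (3.0, 2.0), (0.0, 3.0)]


def objective(point, coeffs=(3.0, 2.0), constant=1.0):
    """A constant plus a linear function of the point (illustrative coefficients)."""
    return constant + coeffs[0] * point[0] + coeffs[1] * point[1]


def random_point_in_hull(vertices, rng):
    """Return a random convex combination of the given vertices."""
    raw = [rng.random() for _ in vertices]
    total = sum(raw)
    weights = [r / total for r in raw]
    x = sum(w * v[0] for w, v in zip(weights, vertices))
    y = sum(w * v[1] for w, v in zip(weights, vertices))
    return (x, y)


# Largest objective value among the extreme points.
best_vertex = max(VERTICES, key=objective)
best_value = objective(best_vertex)

# Objective values at sampled points of the polygon never exceed best_value
# (up to rounding), since the objective is affine.
rng = random.Random(0)
samples = [random_point_in_hull(VERTICES, rng) for _ in range(1000)]
max_sample_value = max(objective(p) for p in samples)
```

The comparison works because an affine function of a convex combination is the same convex combination of its values at the vertices, hence is at most the largest vertex value.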

A natural place where infinite-dimensional compact convex sets arise is in the 
weak-star topology on the closed unit ball of the dual of a normed linear space. 
Alaoglu’s Theorem says that this set is compact, and it is certainly convex. The 
Hahn–Banach Theorem is what shows that this compact convex set contains a 
nonzero element when the normed linear space is nonzero. 

When the whole closed unit ball is the set of interest, let us see what the 
extreme points are like in certain situations. If the underlying normed linear 
space is a Hilbert space, then the real part of a continuous linear functional takes 
its maximum value at a single point of the closed unit ball. The upshot of this 
fact is that the proof of the Krein–Milman Theorem above degenerates; Zorn’s 
Lemma is not needed, for example, to produce an extreme point. The proof 
degenerates in the same way, in fact, whenever one considers some $L^p$ space 
with $1 < p < \infty$. 

The case of $L^\infty$ is more interesting. Let us work with real-valued functions 
in the context of a σ-finite measure space, regarding $L^\infty$ as the dual of $L^1$. The 
extreme points of the closed unit ball are all the $L^\infty$ functions that take only the 
values −1 and +1. 
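
A finite-dimensional analog may make this description concrete: for a finite set with the counting measure, the real $L^\infty$ space is $\mathbb{R}^n$ with the maximum norm, and its closed unit ball is the cube $[-1, 1]^n$. The Python sketch below, whose function names are hypothetical and chosen only for illustration, shows one direction of the claim in that setting: a point of the cube having some coordinate strictly between −1 and +1 is the midpoint of two distinct points of the cube, hence is not extreme. The other direction is immediate, since if $v = \tfrac12(x + y)$ with $|x_i| \le 1$, $|y_i| \le 1$, and $v_i = \pm 1$, then $x_i = y_i = v_i$.

```python
def is_sign_vector(v):
    """True if every coordinate of v equals -1 or +1."""
    return all(x in (-1.0, 1.0) for x in v)


def split_as_midpoint(v):
    """For a point v of the cube [-1, 1]^n, return two distinct points of the
    cube whose midpoint is v, or None if every coordinate of v is -1 or +1.

    The two points differ from v only in the first coordinate that lies
    strictly between -1 and +1.
    """
    for i, x in enumerate(v):
        if -1.0 < x < 1.0:
            # Largest symmetric step that keeps both endpoints inside [-1, 1].
            delta = min(1.0 - x, x + 1.0)
            lower = list(v)
            upper = list(v)
            lower[i] = x - delta
            upper[i] = x + delta
            return tuple(lower), tuple(upper)
    return None


# Usage: a sign vector admits no such splitting, a non-sign vector does.
corner = (1.0, -1.0, 1.0)
interior_coordinate = (1.0, 0.25, -1.0)
corner_split = split_as_midpoint(corner)
other_split = split_as_midpoint(interior_coordinate)
```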

Similarly we can consider the space C([0, 1]) of continuous functions on [0, 1]. 
Again let us work with real-valued functions. Suppose that this Banach space 
is the dual of some normed linear space. Then the closed unit ball of C([0, 1]) 
forms a compact convex set in the weak-star topology. As with $L^\infty$, the extreme 
points are the functions that take only the values −1 and +1. The functions have 
to be continuous, however, and they are therefore constant. So we get only two 
extreme points, the constant functions −1 and +1, and their closed convex hull 
contains only constant functions. The conclusion is that C([0, 1]) is not the dual 
of any normed linear space. 

We can argue similarly with measures and $L^1$ functions. Suppose that X is 
a compact Hausdorff space. The Banach space M(X) of regular complex Borel 
measures on X is the dual of C(X), and the set of nonnegative Borel measures 
of total mass ≤ 1 is a closed convex subset of the unit ball in the weak-star 
topology. This set has to be the closed convex hull of its extreme points. Indeed, 
as is pointed out in Problem 17 at the end of the chapter, the extreme points of 
this set are 0 and the point masses of mass 1 at the points of X; the statement of 
the theorem is reflected in the fact that any regular Borel measure on X with total 
mass ≤ 1 is a weak-star limit of linear combinations of point masses. 

We can consider similarly the space $L^1([0, 1])$ of Borel functions on [0, 1] 
integrable with respect to Lebesgue measure. Suppose that this Banach space is 
the dual of some normed linear space. Then the closed unit ball of $L^1([0, 1])$ 
forms a compact convex set in the weak-star topology. Problem 18 at the end of 
the chapter shows that the extreme points are trying to be the functions whose 
mass is concentrated at a single point, and there are none. The conclusion is that 
$L^1([0, 1])$ is not the dual of any normed linear space. 

The Krein–Milman Theorem begins to show its power when applied to more
subtle closed convex subsets of a unit ball in the weak-star topology. Here is
an example that lies behind the foundations of the theory of locally compact
abelian groups.[18] For concreteness we work with complex-valued functions on
the integers, i.e., doubly infinite sequences. Such a function $f(n)$ is said to be
positive definite if $\sum_{j,k} c(j)\,f(j-k)\,\overline{c(k)} \ge 0$ for all functions $c(n)$ on the
integers with finite support. Positive definite functions are easily checked to
have $f(0) \ge 0$ and $|f(n)| \le f(0)$. In particular, the set $K$ of positive definite
functions $f$ with $f(0) = 1$ may be regarded as a subset of the closed unit ball
of $L^\infty$ of the integers with the counting measure, a space sometimes called $\ell^\infty$.
Weak-star convergence for such functions is the same as pointwise convergence,
and it follows that $K$ is closed, hence compact. Checking the definition, we see
that $K$ is convex. The Krein–Milman Theorem tells us that $K$ is the closed convex
hull of its extreme points. It is shown in Problem 20 at the end of the chapter that
the extreme points are the functions $f_\theta(n) = e^{in\theta}$ for real $\theta$.

[18] Such groups are defined in Chapter VI.
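
As a numerical sanity check on the definition, the following sketch evaluates the quadratic form $\sum_{j,k} c(j) f(j-k) \overline{c(k)}$ for a function of the form $n \mapsto e^{in\theta}$ and compares it with $|\sum_j c(j) e^{ij\theta}|^2$, which is what the form collapses to in this case. The helper names `quadratic_form` and `character`, the value of `theta`, and the random coefficients are illustrative choices and are not taken from the text.

```python
import cmath
import random

def quadratic_form(f, c):
    """Return the sum over j, k of c(j) * f(j - k) * conj(c(k)).

    `c` is a dict mapping integers to complex numbers (finite support).
    """
    total = 0j
    for j, cj in c.items():
        for k, ck in c.items():
            total += cj * f(j - k) * ck.conjugate()
    return total

def character(theta):
    """The function n -> exp(i n theta) on the integers."""
    return lambda n: cmath.exp(1j * n * theta)

random.seed(0)
theta = 0.7  # arbitrary real parameter
c = {n: complex(random.uniform(-1, 1), random.uniform(-1, 1)) for n in range(-3, 4)}

value = quadratic_form(character(theta), c)
# For a character the form collapses to |sum_j c(j) e^{i j theta}|^2.
closed_form = abs(sum(cj * cmath.exp(1j * j * theta) for j, cj in c.items())) ** 2
```

The agreement of `value` with `closed_form` illustrates why each such function is positive definite; it says nothing by itself about extremality, which is the content of Problem 20.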

By way of introduction to the next section, let us consider one more example.
Let $S$ be a compact Hausdorff space, and let $F$ be any homeomorphism of $S$. Put
$X = C(S)$. In the weak-star topology on $M(S)$, the nonnegative regular Borel
measures $\mu$ with $\mu(S) = 1$ form a compact convex subset $K_1$ of $M(S)$. The
Markov–Kakutani Theorem in the next section shows that there exist elements of
$K_1$ invariant under $F$. The invariant such measures therefore form a nonempty
compact convex subset $K$ of $K_1$. According to the Krein–Milman Theorem, $K$ is
the closed convex hull of its set of extreme points. As shown in Problem 19 at the
end of the chapter, the $\mu$'s that are extreme points have the interesting property
that all Borel subsets that are carried onto themselves by the homeomorphism $F$
have measure 0 or 1; the usual name for this phenomenon is that $\mu$ is ergodic with
respect to $F$. Since the Krein–Milman Theorem is saying that extreme points
exist, we obtain the consequence that for each homeomorphism $F$ of $S$, there is
some regular Borel measure $\mu$ with $\mu(S) = 1$ that is ergodic with respect to $F$.


9. Fixed-Point Theorems 


In this section we continue the discussion of convexity and local convexity. We 
shall give two fixed-point theorems. 


Theorem 4.31 (Markov—Kakutani Theorem). Let K be a compact convex set 
in a topological vector space X, and let F be a commuting family of continuous 
linear mappings carrying K into itself. Then there exists a point p in K such that 
T(p) = p for all T in F. 


PROOF. For each integer $n \ge 1$ and member $T$ of F, let

$$T_n = \frac{1}{n}\,(I + T + T^2 + \cdots + T^{n-1}).$$

Let $\mathcal{K}$ be the family of all subsets of X that arise as $T_n(K)$ for some $n \ge 1$ and
some $T$ in F. Each such set is a compact convex subset of $K$, being the image
of a compact convex set under a continuous linear mapping that carries $K$ into
itself. If $\{T_1, \dots, T_k\}$ is a finite subset of F and each $n_j$ is $\ge 1$, then

$$(T_1)_{n_1}(T_2)_{n_2}\cdots(T_k)_{n_k}(K) \subseteq (T_1)_{n_1}(T_2)_{n_2}\cdots(T_{k-1})_{n_{k-1}}(K) \subseteq \cdots \subseteq (T_1)_{n_1}(K).$$

By symmetry and commutativity of the operators,

$$(T_1)_{n_1}(T_2)_{n_2}\cdots(T_k)_{n_k}(K) \subseteq \bigcap_{j=1}^{k} (T_j)_{n_j}(K).$$


Thus the members of K have the finite-intersection property. By compactness 
their intersection is nonempty. Let p be in the intersection. We shall show that 
T(p) = p for all T in F. 

Arguing by contradiction, suppose that $T$ is given in F with $T(p) \ne p$. Choose
a neighborhood $U$ of 0 in $X$ such that $T(p) - p$ is not in $U$. The fact that $p$ is in
the intersection of all the sets in the family $\mathcal{K}$ implies that $p$ is in $T_n(K)$ for
each $n \ge 1$ and thus

$$p = n^{-1}(I + T + T^2 + \cdots + T^{n-1})(q_n)$$

for some $q_n$ in $K$. Applying $T - I$ to this equality, we obtain

$$T(p) - p = n^{-1}(T^n - I)(q_n).$$

Since the left side is not in $U$, the right side is not in $U$. Since $T^n(q_n)$ and $q_n$ are
in $K$, it follows that $n^{-1}(K - K)$ is not contained in $U$ for any $n$. But $K - K$ is a
compact set, being the image under the subtraction mapping of the compact set
$K \times K$, and this conclusion contradicts Lemma 4.7.
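
The averaging device in the proof can be watched in a finite-dimensional toy case. The sketch below takes $K$ to be the probability vectors in $\mathbb{R}^5$ and $T$ to be the cyclic shift, both of which are assumptions made only for illustration, and computes $T_n(q)$ for a point mass $q$. The identity $T(T_n q) - T_n q = n^{-1}(T^n q - q)$ from the proof forces the averages to be nearly fixed by $T$ when $n$ is large.

```python
def apply_shift(q):
    """Cyclic shift T on a list: a linear map carrying probability vectors to probability vectors."""
    return q[-1:] + q[:-1]

def cesaro_average(q, n):
    """Return T_n(q) = (1/n)(q + Tq + ... + T^(n-1) q)."""
    total = [0.0] * len(q)
    current = list(q)
    for _ in range(n):
        total = [s + x for s, x in zip(total, current)]
        current = apply_shift(current)
    return [s / n for s in total]

q = [1.0, 0.0, 0.0, 0.0, 0.0]  # a point mass, an extreme point of the simplex
n = 1001
p_n = cesaro_average(q, n)

# T(p_n) - p_n equals (1/n)(T^n q - q), whose entries have absolute value at most 1/n.
defect = max(abs(x - y) for x, y in zip(apply_shift(p_n), p_n))
```

In this example the averages approach the uniform vector, which is the unique shift-invariant probability vector. In the theorem itself no convergence is claimed; compactness is what supplies the common fixed point.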


Let us return to the example at the end of the previous section. As in that
example, let $S$ be a compact Hausdorff space, and let $F$ be any homeomorphism
of $S$. Put $X = C(S)$. In the weak-star topology on $M(S)$, the nonnegative regular
Borel measures $\mu$ with $\mu(S) = 1$ form a compact convex subset $K_1$ of $M(S)$.
The homeomorphism $F$ acts on $M(S)$ by the formula $T_F(\mu)(E) = \mu(F^{-1}(E))$.
The mapping $T_F$ is linear, and it follows from the definitions that $T_F$ satisfies
$\|T_F(\mu)\|_{M(S)} = \|\mu\|_{M(S)}$. Thus $T_F$ has norm 1 and is continuous. It maps $K_1$
into itself. Taking the commuting family in Theorem 4.31 to be $\{T_F\}$ and applying
the theorem, we obtain the existence of a nonzero $F$-invariant measure on $S$. The
discussion in the previous section went on to observe that the subset $K$ of
$F$-invariant measures in $K_1$, which we now know to be nonempty, is compact
convex in a locally convex topological vector space. Thus $K$ is a set to which we
can apply the Krein–Milman Theorem, and the extreme points turn out to be the
ergodic invariant measures.


Theorem 4.32 (Schauder–Tychonoff Theorem). Let $K$ be a compact convex
set in a locally convex topological vector space, and let $F$ be a continuous function
from $K$ into itself. Then there exists $p$ in $K$ with $F(p) = p$.


The proof of Theorem 4.32 is long and will be omitted.[19] The power in the
result comes from its applicability to nonlinear mappings. In the special case
in which $K$ is the closed unit ball in $\mathbb{R}^n$, it reduces to the celebrated Brouwer
Fixed-Point Theorem.

This kind of theorem has applications to economics, where fixed-point theo- 
rems prove the existence of equilibrium points for certain systems. The theorem 
does not by itself address stability of such an equilibrium point, however. 

By way of illustration, let us return to a comparatively simple situation that was
studied in Chapter IV of Basic. The usual Picard–Lindelöf Existence Theorem[20]
for the initial-value problem with a system $y' = f(t, y)$ of ordinary differential
equations assumes continuity of $f$ and also a Lipschitz condition for $f$ in the
$y$ variable. A variant, the Cauchy–Peano Existence Theorem, is the subject of
problems at the end of Chapter IV of Basic. It assumes only continuity for $f$ and
obtains existence of solutions, with uniqueness being lost. The Cauchy–Peano
result is proved using Ascoli’s Theorem and a nonobvious construction.

Ascoli’s Theorem, as we know from Section X.9 of Basic, is intimately con- 
nected with compactness. Let us see how to combine Ascoli’s Theorem and the 
Schauder—Tychonoff Theorem to obtain a more transparent proof of the Cauchy— 
Peano result than was suggested in the problems at the end of Chapter IV of Basic. 
To keep the notation simple, we stick with the case of a single equation, rather 
than a system. We suppose that $f(t, y)$ is continuous on an open subset $D$ of $\mathbb{R}^2$.
Let $(t_0, y_0)$ be in $D$, and let $R$ be a closed rectangle in $D$ centered at $(t_0, y_0)$ and
having the form

$$R = \{(t, y) \mid |t - t_0| \le a \text{ and } |y - y_0| \le b\}.$$

Suppose that $|f(t, y)| \le M$ on $R$. Put $a' = \min\{a, b/M\}$. The theorem is that
there exists a continuously differentiable solution $y(t)$ to the initial-value problem
$y' = f(t, y)$, $y(t_0) = y_0$, for $|t - t_0| < a'$.

For the proof let $X$ be the Banach space $C(\{t \mid |t - t_0| \le a'\})$, and let $K$ be the
closure of the set

$$E = \left\{\, y \in X \;\middle|\; \begin{array}{l} \text{(i) } y(t_0) = y_0, \\ \text{(ii) } y' \text{ is continuous for } |t - t_0| < a', \\ \text{(iii) } |y'(t)| \le M \text{ for } |t - t_0| < a' \end{array} \right\}$$

in the Banach space $X$. Condition (iii) makes $E$ an equicontinuous family, and
(i) and (iii) together make $E$ pointwise bounded. Lemma 10.47 of Basic shows
that the closure $K$ is equicontinuous and pointwise bounded. Ascoli’s Theorem
therefore shows that $K$ is compact. Define a function $F$ carrying the space $K$ of
functions to another space of functions by

$$F(y)(t) = y_0 + \int_{t_0}^{t} f(s, y(s))\,ds.$$

For $y$ in $E$, we have $|y(s) - y_0| \le M|s - t_0| \le Ma' \le b$, and thus $(s, y(s))$
is in the rectangle $R$. Hence $F(y)$ satisfies (i), (ii), and (iii) and is in $E$. So $F$
carries $E$ to itself. The formula for $F$ makes clear that $F$ extends to a continuous
mapping on $K$ in the supremum-norm topology. Since $F(E) \subseteq E$, we obtain
$F(K) \subseteq K$. The set $K$ is compact convex in a Banach space, which is locally
convex. The Schauder–Tychonoff Theorem applies to $F$, and the fixed point it
produces is the desired solution.

[19] A proof may be found in Dunford–Schwartz’s Linear Operators, Part I, pp. 453–456 and
467–469.
[20] Theorem 4.1 of Basic.
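
To see concretely that fixed points of $F$ need not be unique when the Lipschitz condition is dropped, consider the standard example $y' = \sqrt{|y|}$, $y(0) = 0$, which is not discussed in the text and is used here only as an illustration. Both $y \equiv 0$ and $y(t) = t^2/4$ for $t \ge 0$ are fixed points of the integral map. The sketch below checks this on a grid, with the trapezoidal rule standing in for the integral; the function name `picard_map` and the grid are hypothetical choices.

```python
import math

def picard_map(f, y_values, t_values, y0):
    """Approximate F(y)(t) = y0 + integral from t0 to t of f(s, y(s)) ds
    on a grid whose first point is t0, using the trapezoidal rule."""
    out = [y0]
    for i in range(1, len(t_values)):
        h = t_values[i] - t_values[i - 1]
        left = f(t_values[i - 1], y_values[i - 1])
        right = f(t_values[i], y_values[i])
        out.append(out[-1] + 0.5 * h * (left + right))
    return out

def f(t, y):
    """Right side sqrt(|y|): continuous, but not Lipschitz in y near y = 0."""
    return math.sqrt(abs(y))

ts = [i / 100 for i in range(101)]  # grid on [0, 1], with t0 = 0
zero_solution = [0.0 for _ in ts]
parabola = [t * t / 4 for t in ts]

residual_zero = max(
    abs(a - b) for a, b in zip(picard_map(f, zero_solution, ts, 0.0), zero_solution)
)
residual_parabola = max(
    abs(a - b) for a, b in zip(picard_map(f, parabola, ts, 0.0), parabola)
)
```

Both residuals vanish up to rounding, so the map has at least two fixed points here. The Schauder–Tychonoff Theorem guarantees existence only, in keeping with the remark that uniqueness is lost.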


10. Gelfand Transform for Commutative C* Algebras 


Alaoglu’s Theorem, obtained in Section 3, leads in several directions in functional 
analysis, and we now return to its ramifications for spectral theory. The Stone 
Representation Theorem in Section 4 gave a concrete example of what we shall 
be investigating, showing that certain subalgebras of the algebra B(S) of all 
complex-valued bounded functions on a set S can be realized as the algebra of 
all complex-valued continuous functions on a suitable compact Hausdorff space. 
The present section is devoted to a generalization due to I. M. Gelfand of this result 
to certain algebras besides B(S); a different special case of this generalization will 
yield in the next section the Spectral Theorem for bounded self-adjoint operators 
on a Hilbert space. 

Recall from Section 4 that a complex Banach algebra $A$ is a complex
associative algebra having a norm that makes it into a Banach space such that
$\|ab\| \le \|a\|\,\|b\|$ for all $a$ and $b$ in $A$. We shall not consider $A = 0$ as a Banach
algebra. Nor shall we have any occasion to consider real Banach algebras. The
inequality concerning the norm under multiplication implies that multiplication
is continuous. If the Banach algebra has an identity, the same inequality implies
that $\|1\| \ge 1$.


EXAMPLES. 


(1) If $S$ is a nonempty set, then the algebra $B(S)$ of all bounded complex-valued
functions on $S$ is a commutative Banach algebra. The function 1 is an identity.
If $S$ has a topology, then the subalgebra $C(S)$ of bounded continuous functions
gives another example of a commutative Banach algebra with identity.

(2) If $(S, \mu)$ is a $\sigma$-finite measure space, then pointwise multiplication and the
essential-supremum norm make $L^\infty(S, \mu)$ into a commutative Banach algebra
with identity.

(3) In Euclidean space $\mathbb{R}^N$, the Banach space $L^1(\mathbb{R}^N)$ with Lebesgue
measure becomes a commutative Banach algebra with convolution as multiplication:
$(f * g)(x) = \int_{\mathbb{R}^N} f(x-y)g(y)\,dy = \int_{\mathbb{R}^N} f(y)g(x-y)\,dy$. This Banach algebra
does not have an identity. A variant of this Banach algebra may be defined using
functions on $\mathbb{R}^N$ periodic in each variable with period $2\pi$, the measure being
$(2\pi)^{-N}\,dx$, and convolution being the multiplication. Still another variant uses
functions on $\mathbb{Z}$ integrable with respect to the counting measure, and convolution
is again the multiplication.

(4) If $H$ is a complex Hilbert space, then the algebra $B(H, H)$ of all bounded
linear operators from $H$ to itself is a Banach algebra with identity when the norm
is the operator norm and the multiplication is composition of operators.


The example of $L^1$ is so important that one does not want automatically to
impose a condition on a Banach algebra that it contain an identity. Nevertheless, 
it is always possible to adjoin an identity to a Banach algebra if one wants, as the 
following proposition shows. 


Proposition 4.33. Let $A$ be a complex Banach algebra, and let

$$B = \{(a, \lambda) \mid a \text{ is in } A \text{ and } \lambda \text{ is in } \mathbb{C}\} = A \oplus \mathbb{C}$$

as a vector space. Define

$$(a, \lambda)(b, \mu) = (ab + \lambda b + \mu a,\ \lambda\mu) \qquad \text{and} \qquad \|(a, \lambda)\| = \|a\| + |\lambda|.$$

Then $B$ is a complex Banach algebra with identity $(0, 1)$, and the map $a \mapsto (a, 0)$
is a norm-preserving algebra homomorphism of $A$ onto a closed ideal in $B$.


REMARKS. The formula for the multiplication is motivated by expansion of
the product $(a + \lambda)(b + \mu)$, and the formula for the norm is motivated by the
norm of the element $f\,dx + \lambda\delta_0$ in $M(\mathbb{R}^N)$, where $\delta_0$ is a point mass of weight 1
at the origin. We omit the proof of the proposition, since we shall not pursue $L^1$
very far from this point of view.


To proceed further, let us go back to our examples and see what can be said 
about them. For B(S) in Example 1, the Stone Representation Theorem realized 
certain subalgebras as C(X) for some compact Hausdorff space X. The space X 
is the space of all nonzero continuous multiplicative linear functionals respecting 
complex conjugation, regarded as a closed subset of the set of all continuous linear 
functionals of norm $\le 1$ with the weak-star topology. Evaluations at points of $S$
provide examples of members of $X$, and $X$ is just the closure of those evaluations.

To what extent might multiplicative linear functionals help us understand
the other examples? For $L^\infty$ in Example 2, the notion of multiplicative linear
functional is meaningful, but it is not clear that any nonzero ones exist. At points
of the measure space of positive measure, evaluations are well defined and yield
multiplicative linear functionals. But if every one-point set of the measure space
has measure 0, then it is not clear how to proceed.

For $L^1$ in Example 3, the answer is more decisive. The most general
continuous linear functional is integration with an $L^\infty$ function, and the nonzero
continuous multiplicative linear functionals are the ones where the $L^\infty$ function
is an exponential $x \mapsto e^{ix\cdot y}$ for some $y$ in $\mathbb{R}^N$. Let us sketch the argument. If a
multiplicative linear functional $\ell$ is given by the nonzero $L^\infty$ function $\varphi$, then the
condition $\ell(f * g) = \ell(f)\ell(g)$ translates into the condition

$$\iint_{\mathbb{R}^N \times \mathbb{R}^N} f(x)g(y)\varphi(x+y)\,dx\,dy = \iint_{\mathbb{R}^N \times \mathbb{R}^N} f(x)g(y)\varphi(x)\varphi(y)\,dx\,dy.$$

Since $f$ and $g$ are arbitrary, $\varphi(x+y) = \varphi(x)\varphi(y)$ a.e. $[dx\,dy]$. Letting $\rho$ be in
$C_{\mathrm{com}}(\mathbb{R}^N)$ and integrating this equation with $\rho(y)$ gives

$$\int_{\mathbb{R}^N} \rho(y)\varphi(x+y)\,dy = \varphi(x)\int_{\mathbb{R}^N} \rho(y)\varphi(y)\,dy \qquad \text{a.e. } [dx].$$

The left side, upon the change of variables $y \mapsto -y$, is the convolution of a
function in $C_{\mathrm{com}}(\mathbb{R}^N)$ and a function in $L^\infty(\mathbb{R}^N)$. It is therefore continuous
as a function of $x$. On the right side some $\rho$ has $\int_{\mathbb{R}^N} \rho(y)\varphi(y)\,dy \ne 0$ since
$\varphi$ is not the 0 function almost everywhere. Fixing such a $\rho$ and dividing by
$\int_{\mathbb{R}^N} \rho(y)\varphi(y)\,dy$, we see that $\varphi(x)$ is almost everywhere equal to a certain
continuous function. We may therefore adjust $\varphi$ on a set of measure 0 to be
continuous. Once adjusted, $\varphi$ satisfies $\varphi(x+y) = \varphi(x)\varphi(y)$ everywhere. It is
then a simple matter to see that $\varphi$ is an exponential, as asserted.

Example 4 is something like Example 2. Suppose that A is a bounded self- 
adjoint operator on the Hilbert space H. We can form the smallest subalgebra 
of B(H, H) containing A and the identity, and we can look for multiplicative 
linear functionals. Theorem 2.3 addresses a situation in which we can identify 
such functionals. If A is compact, then the theorem gives an orthonormal basis 
of eigenvectors, and every member of this algebra acts as a scalar on each eigen- 
vector. So each eigenvector yields, via the corresponding set of eigenvalues, a 
multiplicative linear functional. If A is not compact, however, eigenvectors need 
not exist, and then it is unclear where to look to find nonzero multiplicative linear 
functionals. 

A series of theoretical insights now comes into play. An associative algebra 
with identity need not have nonzero multiplicative linear functionals, but it always
has maximal ideals. These come from Zorn’s Lemma, the proper ideals being
those ideals not containing the identity. Accordingly, we shall think in terms of 
maximal ideals. These turn out to be closed, because as we shall see, there is a 
neighborhood of the identity where every element is invertible with an inverse 
given by the sum of a geometric series. The quotient of a commutative complex 
Banach algebra with identity by a (closed) maximal ideal is then a complex 
Banach algebra in which every nonzero element is invertible. The remarkable 
fact is that such a quotient necessarily is 1-dimensional. Then it follows that 
the maximal ideals all correspond to continuous multiplicative linear functionals 
after all, and their existence has been established. Let us run through the steps. 

Let A be a Banach algebra with identity, at first not necessarily commutative. 
If a is in A, then a right inverse to a is an element b with ab = 1. If a has a right 
inverse b and if a has a left inverse c, then the two are equal as a consequence 
of the associativity of multiplication: $c = c1 = c(ab) = (ca)b = 1b = b$. So
a has a two-sided inverse, which we call simply an inverse, and we say that a is 
invertible. 


Proposition 4.34. Let $A$ be a Banach algebra with identity. If $\|a\| < 1$, then
$1 - a$ is invertible and $\|(1 - a)^{-1}\| \le (1 - \|a\|)^{-1}$.


PROOF. Form $\sum_{n=0}^{\infty} a^n$. This series is Cauchy since $\|a^n\| \le \|a\|^n$ implies
$\bigl\|\sum_{n=M}^{N} a^n\bigr\| \le \sum_{n=M}^{N} \|a\|^n \le \|a\|^M (1 - \|a\|)^{-1}$. Since $A$ is complete, the series
$\sum_{n=0}^{\infty} a^n$ is convergent. Let $b$ be its sum. Then we have $(1 - a)\bigl(\sum_{n=0}^{N} a^n\bigr) =
\bigl(\sum_{n=0}^{N} a^n\bigr)(1 - a) = 1 - a^{N+1}$, and hence $(1 - a)b = b(1 - a) = 1$. Also,
$\|b\| \le \sum_{n=0}^{\infty} \|a\|^n = (1 - \|a\|)^{-1}$.
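
The geometric series in the proof can be tried out in the Banach algebra of 2-by-2 real matrices. The sketch below uses the maximum absolute row sum as the norm, which is submultiplicative and gives the identity norm 1; the particular matrix `a`, the number of terms, and the helper names are illustrative assumptions.

```python
def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]

def norm(x):
    """Maximum absolute row sum: a submultiplicative norm with norm(identity) equal to 1."""
    return max(sum(abs(v) for v in row) for row in x)

identity = [[1.0, 0.0], [0.0, 1.0]]
a = [[0.2, 0.3], [-0.1, 0.4]]  # each absolute row sum is 0.5, so norm(a) < 1

# Partial sum 1 + a + a^2 + ... + a^199 of the geometric series.
b = [[0.0, 0.0], [0.0, 0.0]]
power = identity
for _ in range(200):
    b = mat_add(b, power)
    power = mat_mul(power, a)

one_minus_a = [[identity[i][j] - a[i][j] for j in range(2)] for i in range(2)]
product = mat_mul(one_minus_a, b)  # should be very close to the identity
```

With $\|a\| = 0.5$ the proposition predicts $\|(1-a)^{-1}\| \le 2$, and the computed partial sum respects that bound.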


Corollary 4.35. In a Banach algebra with identity, the invertible elements
form an open set. More particularly if $a$ is invertible and $\|x - a\| < \|a^{-1}\|^{-1}$,
then $x$ is invertible.


PROOF. Let $U$ be the set of invertible elements, and let $a$ be in $U$. If $\|x - a\| <
\|a^{-1}\|^{-1}$, then

$$\|a^{-1}x - 1\| = \|a^{-1}(x - a)\| \le \|a^{-1}\|\,\|x - a\| < 1,$$

and Proposition 4.34 shows that $1 - (1 - a^{-1}x) = a^{-1}x$ is invertible. Hence $x$
is invertible.


Proposition 4.36. If $A$ is a Banach algebra with identity and $U$ is the open set
of invertible elements, then inversion is a continuous map of $U$ into itself.


PROOF. Let $a$ be in $U$, and let $\|x - a\| < \|a^{-1}\|^{-1}$, so that $x$ is in $U$ by
Corollary 4.35. Then

$$\|x^{-1} - a^{-1}\| = \|x^{-1}(a - x)a^{-1}\| \le \|a^{-1}\|\,\|x^{-1}\|\,\|a - x\|,$$

and continuity will follow if we show that $\|x^{-1}\| \le M < \infty$ for $x$ near $a$.
Computation and Proposition 4.34 give

$$\|x^{-1}\| = \|(a - (a - x))^{-1}\| = \|a^{-1}(1 - (a - x)a^{-1})^{-1}\| \le \frac{\|a^{-1}\|}{1 - \|(a - x)a^{-1}\|},$$

and the desired boundedness follows from continuity of multiplication.


Let $A$ be a complex Banach algebra with identity. If $a$ is in $A$, the spectrum
of $a$ is the set

$$\sigma(a) = \{\lambda \in \mathbb{C} \mid a - \lambda \text{ is not invertible}\}.$$

It will be proved in Corollary 4.39 below that $\sigma(a)$ is always nonempty. The
resolvent set $P(a)$ of $a$ is the complement of $\sigma(a)$ in $\mathbb{C}$. The resolvent of $a$ is
the function

$$R(\lambda) = (a - \lambda)^{-1} \quad \text{from } P(a) \text{ into } A.$$

The spectral radius of $a$, denoted by $r(a)$, is

$$r(a) = \sup\{|\lambda| \mid \lambda \text{ is in } \sigma(a)\}.$$


Proposition 4.37. For $a$ in a complex Banach algebra $A$ with identity, $\sigma(a)$
is compact and $r(a)$ is $\le \|a\|$.


PROOF. The function $\lambda \mapsto a - \lambda$ is continuous, and the set $U$ of invertible
elements is open, the latter by Corollary 4.35. Thus $P(a) = \{\lambda \mid a - \lambda \text{ is in } U\}$
is open. Hence the complement $\sigma(a)$ is closed. Fix $\lambda$ with $|\lambda| > \|a\|$. Then
$\|\lambda^{-1}a\| < 1$, and therefore $\lambda^{-1}a - 1$ is in $U$. Since $\lambda \ne 0$, $a - \lambda$ is in $U$.
Thus $\lambda$ is in $P(a)$. It follows that $\sigma(a)$ is contained in $\{\lambda \mid |\lambda| \le \|a\|\}$ and that
$r(a) \le \|a\|$. Since $\sigma(a)$ is then bounded, as well as closed, $\sigma(a)$ is compact.


We say that a function $g$ from an open subset $V$ of $\mathbb{C}$ into the complex Banach
algebra $A$ is weakly analytic on $V$ if $\ell \circ g$ is an analytic function on $V$ for every
$\ell$ in the dual space $A^*$.


Theorem 4.38. If $A$ is a complex Banach algebra with identity and if $a$ is in
$A$, then $R(\lambda) = (a - \lambda)^{-1}$ is weakly analytic on $P(a)$ with $\lim_{\lambda \to \infty} \|R(\lambda)\| = 0$.


PROOF. Let $\lambda_0$ be in $P(a)$, and let $\ell$ be in $A^*$. Writing

$$a - \lambda = (a - \lambda_0)\bigl(1 - (a - \lambda_0)^{-1}(\lambda - \lambda_0)\bigr)$$

and applying Proposition 4.34, we see that $a - \lambda$ is invertible if the condition
$\|(a - \lambda_0)^{-1}(\lambda - \lambda_0)\| < 1$ is satisfied. In this case,

$$(a - \lambda)^{-1} = (a - \lambda_0)^{-1} \sum_{n=0}^{\infty} (a - \lambda_0)^{-n}(\lambda - \lambda_0)^n,$$

and the continuity of $\ell$ yields

$$\ell\bigl((a - \lambda)^{-1}\bigr) = \sum_{n=0}^{\infty} \ell\bigl((a - \lambda_0)^{-n-1}\bigr)(\lambda - \lambda_0)^n,$$

with the series convergent. Therefore $R(\lambda)$ is weakly analytic.
To establish that $\lim_{\lambda \to \infty} \|(a - \lambda)^{-1}\| = 0$, we write

$$(a - \lambda)^{-1} = \bigl(\lambda(\lambda^{-1}a - 1)\bigr)^{-1} = \lambda^{-1}(\lambda^{-1}a - 1)^{-1}.$$

Proposition 4.34 gives

$$\|(\lambda^{-1}a - 1)^{-1}\| \le (1 - \|\lambda^{-1}a\|)^{-1},$$

and the right side tends to 1 as $\lambda$ tends to infinity. Hence $\lim_{\lambda \to \infty} \|(a - \lambda)^{-1}\| = 0$.


Corollary 4.39. If $A$ is a complex Banach algebra with identity, then $\sigma(a)$ is
nonempty for each $a$ in $A$.


PROOF. If $\sigma(a)$ were to be empty, then every $\ell$ in $A^*$ would have $\lambda \mapsto
\ell((a - \lambda)^{-1})$ entire and vanishing at infinity, by Theorem 4.38. By Liouville’s
Theorem, we would have $\ell((a - \lambda)^{-1}) = 0$ for every $\ell$ and $\lambda$. Since $\ell$ is arbitrary,
the Hahn–Banach Theorem would give $(a - \lambda)^{-1} = 0$, contradiction.


Corollary 4.40 (Gelfand–Mazur Theorem). The only complex Banach algebra
with identity in which every nonzero element is invertible is $\mathbb{C}$ itself.


PROOF. Suppose that $A$ is a complex Banach algebra with identity with every
nonzero element invertible. If $a$ is given in $A$, $\sigma(a)$ is not empty, according to
Corollary 4.39. Choose $\lambda$ in $\sigma(a)$. Then $a - \lambda$ is not invertible. Since every
nonzero element of $A$ is by assumption invertible, $a - \lambda = 0$. Hence $a = \lambda$.
Thus $A$ consists of the scalar multiples of the identity.


Corollary 4.41. If $A$ is a commutative complex Banach algebra with identity,
then the nonzero multiplicative linear functionals on $A$ stand in one-one
correspondence with the maximal ideals of $A$, the correspondence being

$$\ell = \left(\begin{array}{c}\text{multiplicative}\\ \text{linear functional}\end{array}\right) \;\longmapsto\; \ker \ell = \text{maximal ideal}$$

with inverse

$$I = \left(\begin{array}{c}\text{maximal ideal,}\\ \text{necessarily with}\\ A = I \oplus \mathbb{C}1\end{array}\right) \;\longmapsto\; \ell \text{ defined by } \ell(x, \lambda) = \lambda.$$

Every nonzero multiplicative linear functional is continuous with norm $\le 1$, and
every maximal ideal is closed. Every nonzero multiplicative linear functional
carries 1 into 1.


REMARKS. The proof will make use of Problem 4 in Chapter XII of Basic:
if $X$ is a Banach space and $Y$ is a closed subspace, then the vector space $X/Y$
becomes a normed linear space under the definition $\|x + Y\| = \inf_{y \in Y} \|x + y\|$,
and the resulting metric on $X/Y$ is complete. Problem 1 at the end of the present
chapter points out that the Banach space $X/Y$ obtained this way has the same
topology as the quotient topological vector space $X/Y$ defined in Section 1.


PROOF. We may assume $A \ne 0$. If $\ell$ is a nonzero multiplicative linear
functional, then its kernel is an ideal of codimension 1, hence is a maximal ideal.
Conversely if $I$ is a maximal ideal, then no element of $I$ can be invertible. Since
the set $U$ of invertible elements is open, according to Corollary 4.35, the set $I$
is at positive distance from 1. Thus the closure $I^{\mathrm{cl}}$, which is an ideal, does not
contain 1. Since $I$ is maximal, $I^{\mathrm{cl}} = I$. Thus $I$ is closed. By the above remarks,
$A/I$ is a complex Banach space. Its multiplication makes it into a complex
Banach algebra because if we take the infimum over $y_1 \in I$ and $y_2 \in I$ of the
right side of the inequality

$$\|a_1a_2 + I\| \le \|a_1a_2 + (y_1a_2 + a_1y_2 + y_1y_2)\| = \|(a_1 + y_1)(a_2 + y_2)\| \le \|a_1 + y_1\|\,\|a_2 + y_2\|,$$

we obtain $\|a_1a_2 + I\| \le \|a_1 + I\|\,\|a_2 + I\|$. The quotient $A/I$ is also a field, being
the quotient of a nonzero commutative ring with identity by a maximal ideal. By
Corollary 4.40, $A/I \cong \mathbb{C}$. Hence $I$ has codimension 1, and $A = I \oplus \mathbb{C}1$ as vector
spaces. If we define a linear functional $\ell$ by $\ell(x, \lambda) = \lambda$, then we readily check
that $\ell$ is multiplicative and has kernel $I$. To see that $\ell$ is continuous, one way to
proceed is to use the Hahn–Banach Theorem: Since $I$ is closed and 1 is not in $I$,
there exists a continuous linear functional $\ell'$ with $\ell'(1) \ne 0$ and $\ell'(I) = 0$. Then
$\ell = \ell'(1)^{-1}\ell(1)\ell'$, and therefore $\ell$ is continuous.

This establishes the correspondence. To check that it is one-one, it is enough
to see that any nonzero multiplicative linear functional carries 1 into 1. If $\ell$ is
a nonzero multiplicative linear functional, then $\ell(a) = \ell(a1) = \ell(a)\ell(1)$. If
we choose $a$ with $\ell(a) \ne 0$, then we can divide and conclude that $\ell(1) = 1$.

Finally we check the norm of the nonzero multiplicative linear functional $\ell$.
If $a$ in $A$ has $\|a\| \le 1$, then $|\ell(a)|^n = |\ell(a^n)| \le \|\ell\|\,\|a^n\| \le \|\ell\|\,\|a\|^n \le \|\ell\|$.
Since $n \ge 1$ is arbitrary, we must have $|\ell(a)| \le 1$. Taking the supremum over $a$,
we obtain $\|\ell\| \le 1$.


If $A$ is a commutative complex Banach algebra with identity, we denote its
space of maximal ideals by $A^*_m$. For $A \ne 0$, this space is nonempty by an
application of Zorn’s Lemma to the set of all proper ideals of $A$. Using the
identification via Corollary 4.41 of $A^*_m$ as a set of linear functionals of norm $\le 1$,
we can regard $A^*_m$ as a subset of the unit ball of the dual $A^*$. We give $A^*_m$ the
relative topology from the weak-star topology on $A^*$.


Proposition 4.42. If $A$ is a commutative complex Banach algebra with
identity, then the weak-star topology makes the maximal ideal space $A^*_m$ into
a compact Hausdorff space.


PROOF. Corollary 4.41 identifies $A^*_m$ with a subset of the unit ball of $A^*$, which
is compact in the weak-star topology by Alaoglu’s Theorem (Theorem 4.14) and
is also Hausdorff. All we have to do is show that $A^*_m$ is a closed subset. For each
$a$ and $b$ in $A$, the set $\{\ell \in A^* \mid \ell(ab) = \ell(a)\ell(b)\}$ is closed since the functions
$\ell \mapsto \ell(ab)$ and $\ell \mapsto \ell(a)\ell(b)$ are continuous from the weak-star topology into
$\mathbb{C}$. Hence the intersection over all $a$ and $b$ is closed. The set $A^*_m$ is the intersection
of this set with the closed set $\{\ell \in A^* \mid \ell(1) = 1\}$ and is therefore closed.


For $L^1$ or any other complex Banach algebra $A$ not containing an identity, the
prescription for applying the above theory to $A$ is to adjoin an identity and form
$A \oplus \mathbb{C}$, apply the results to $A \oplus \mathbb{C}$, and then see what happens when the identity
is removed. For Proposition 4.42, $A$ is one of the maximal ideals in $A \oplus \mathbb{C}$.
Removing it from $(A \oplus \mathbb{C})^*_m$ yields a locally compact Hausdorff space whose
one-point compactification is $(A \oplus \mathbb{C})^*_m$.

It is now just a formality to obtain a mapping of any commutative complex
Banach algebra $A$ with identity into $C(A^*_m)$. The Gelfand transform
$a \mapsto \hat{a}$ is the mapping of $A$ into $C(A^*_m)$ given by $\hat{a}(\ell) = \ell(a)$ for each nonzero
multiplicative linear functional $\ell$ on $A$.

In the context of a suitable subalgebra of $B(S)$, the Gelfand transform is just
the evaluation of all nonzero multiplicative linear functionals on the members of
the subalgebra. Such linear functionals turn out automatically to respect complex
conjugation.[21] The evaluations at the points of $S$ are a dense subset of these.
The Stone Representation Theorem says that the Gelfand transform is a
norm-preserving algebra isomorphism.

In the context of $L^1(\mathbb{R}^N)$, the Gelfand transform is just the Fourier transform.
The nonzero multiplicative linear functionals are the functions $\ell_y(f) =
\int_{\mathbb{R}^N} f(x)e^{-2\pi i x \cdot y}\,dx$ for $y \in \mathbb{R}^N$, i.e., $\ell_y(f) = \hat{f}(y)$. The Gelfand transform is
the mapping of $f$ to the resulting function of $\ell_y$, or of $y$. It is therefore exactly
the Fourier transform $f \mapsto \hat{f}$ if we parametrize $L^1(\mathbb{R}^N)^*_m$ by the variable $y$.
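
The multiplicativity that makes the Fourier transform a Gelfand transform can be checked by hand in the variant of Example 3 built on the integers with the counting measure. The sketch below works with finitely supported sequences, stored as dictionaries, and verifies that the functional $f \mapsto \sum_n f(n)e^{-in\theta}$ carries convolution to multiplication. The normalization $e^{-in\theta}$, the sample sequences, and the function names are assumptions made for the illustration.

```python
import cmath

def convolve(f, g):
    """Convolution on Z of finitely supported sequences stored as {n: value} dicts."""
    out = {}
    for j, fj in f.items():
        for k, gk in g.items():
            out[j + k] = out.get(j + k, 0j) + fj * gk
    return out

def transform(f, theta):
    """Evaluate the functional f -> sum over n of f(n) * exp(-i n theta)."""
    return sum(v * cmath.exp(-1j * n * theta) for n, v in f.items())

f = {-1: 2.0, 0: 1.0 + 1.0j, 3: -0.5}
g = {0: 1.0, 1: 0.25j, 2: -3.0}
theta = 1.3  # arbitrary real parameter

lhs = transform(convolve(f, g), theta)               # functional applied to f * g
rhs = transform(f, theta) * transform(g, theta)      # product of the two values
```

The two sides agree for every real `theta`, which is the statement that each such functional is multiplicative on this convolution algebra.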

The Gelfand transform makes sense for our other two examples as well, for
$L^\infty$ and for the complex Banach algebra generated by the identity and a single
self-adjoint bounded linear operator on a Hilbert space. But we do not so far
get much insight into what the Gelfand transform does for these cases. We can
summarize all the formalism as follows.


Proposition 4.43. If $A$ is a commutative complex Banach algebra with identity,
then the Gelfand transform is an algebra homomorphism of norm $\le 1$ of $A$ into
$C(A^*_m)$ carrying 1 to 1, and its kernel is the intersection of all maximal ideals of
$A$. Moreover, for each $a$ and $b$ in $A$,

(a) $\sigma(a)$ is the image of the function $\hat{a}$ in $\mathbb{C}$,
(b) $r(a) = \|\hat{a}\|_{\sup}$,
(c) $r(a + b) \le r(a) + r(b)$ and $r(ab) \le r(a)r(b)$.


PROOF. The Gelfand transform is an algebra homomorphism because

$$\widehat{ab}(\ell) = \ell(ab) = \ell(a)\ell(b) = \hat{a}(\ell)\hat{b}(\ell)$$

for all $\ell$ in $A^*_m$. Corollary 4.41 shows that each $\ell$ in $A^*_m$ has norm $\le 1$, and therefore
$|\hat{a}(\ell)| = |\ell(a)| \le \|a\|$. Hence $\|\hat{a}\|_{\sup} \le \|a\|$, and the Gelfand transform has norm
$\le 1$. Corollary 4.41 shows that every nonzero multiplicative linear functional
carries 1 into 1, and therefore the Gelfand transform carries 1 into 1.

The kernel of the Gelfand transform is the set of all $a$ in $A$ with $\hat{a}(\ell) = 0$ for
all $\ell$, thus the set of all $a$ with $\ell(a) = 0$ for all $\ell$, thus the intersection of the
kernels of all $\ell$’s.

For (a), we observe that $a$ is invertible if and only if $aA = A$, if and only if $a$ is
not in any maximal ideal, if and only if $\hat{a}$ is nowhere vanishing. Thus a complex
number $\lambda$ is in $\sigma(a)$ if and only if $a - \lambda$ is not invertible, if and only if $\hat{a} - \lambda$ is
somewhere vanishing, if and only if $\lambda$ is in the image of $\hat{a}$. This proves (a).


Conclusion (b) is immediate from (a) and the definition of $r(a)$, and (c) follows
from (b) and the inequalities satisfied by the supremum norm. This completes
the proof.

[21] The verification for an algebra as in Theorem 4.15 that the nonzero multiplicative linear
functionals automatically respect complex conjugation is embedded in the proof of Theorem 4.48
below. See the paragraph of the proof containing the display (+) and the two paragraphs that follow
it.


Proposition 4.43 isolates the real problem, which is to say something
quantitative about the intersection of all maximal ideals, about $\sigma(a)$, and
about $r(a)$. For our purposes it will be enough to have the spectral radius formula
that is proved in Corollary 4.46 below.


Theorem 4.44 (Spectral Mapping Theorem). If $A$ is a complex Banach algebra
with identity, if $a$ is in $A$, and if $Q$ is any polynomial in one variable, then
$Q(\sigma(a)) = \sigma(Q(a))$.

REMARKS. The left side $Q(\sigma(a))$ is understood to be the image under $Q$ of the
set $\sigma(a)$, while the right side $\sigma(Q(a))$ is the spectrum of $Q(a)$, i.e., the spectrum
of the member of $A$ obtained by substituting $a$ for the variable in $Q$.


PROOF. First we show that $Q(\sigma(a)) \subseteq \sigma(Q(a))$. Let $\lambda_0$ be in $\sigma(a)$, so that
$a - \lambda_0$ is not invertible. Arguing by contradiction, suppose that $Q(a) - Q(\lambda_0)$
is invertible, say with $b$ as two-sided inverse. Let $S$ be the polynomial
defined by $Q(\lambda) - Q(\lambda_0) = (\lambda - \lambda_0)S(\lambda)$. Since $b$ is a two-sided inverse of
$Q(a) - Q(\lambda_0) = (a - \lambda_0)S(a)$, we have $1 = b(a - \lambda_0)S(a) = (bS(a))(a - \lambda_0)$
and $1 = (a - \lambda_0)(S(a)b)$. Thus $a - \lambda_0$ has a left inverse $bS(a)$ and a right inverse
$S(a)b$, and $a - \lambda_0$ must be invertible, contradiction.

For the reverse inclusion $\sigma(Q(a)) \subseteq Q(\sigma(a))$, suppose that $\lambda_0$ is in $\sigma(Q(a))$.
Let $\lambda_1, \dots, \lambda_n$ be the roots of $Q(\lambda) - \lambda_0$ repeated according to their multiplicities.
Then we have $Q(\lambda) - \lambda_0 = c(\lambda - \lambda_1)\cdots(\lambda - \lambda_n)$ for some nonzero constant $c$.
Substitution of $a$ for $\lambda$ gives

$$Q(a) - \lambda_0 = c(a - \lambda_1)\cdots(a - \lambda_n).$$

Since $Q(a) - \lambda_0$ is by assumption not invertible, some $a - \lambda_j$ is not invertible.
For this $j$, $\lambda_j$ is in $\sigma(a)$. Since $\lambda_j$ is a root of $Q(\lambda) - \lambda_0$, we have $Q(\lambda_j) - \lambda_0 = 0$,
i.e., $Q(\lambda_j) = \lambda_0$. Hence $\lambda_0$ is exhibited as $Q$ of the member $\lambda_j$ of $\sigma(a)$.
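
In the Banach algebra of 2-by-2 complex matrices the spectrum of a matrix is its set of eigenvalues, so the theorem can be checked directly in a small case. The sketch below uses one illustrative matrix `a` and one illustrative polynomial `Q`, neither taken from the text, and computes eigenvalues from the trace and determinant.

```python
import cmath

def mat_mul(x, y):
    """Product of two 2-by-2 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def sort_key(z):
    """Order complex numbers by real part, then imaginary part."""
    return (z.real, z.imag)

def spectrum(m):
    """Eigenvalues of a 2-by-2 matrix, from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return sorted([(tr - root) / 2, (tr + root) / 2], key=sort_key)

def Q(lam):
    """The polynomial Q(lambda) = lambda^2 - 4 lambda + 1 on scalars."""
    return lam * lam - 4 * lam + 1

def Q_of_matrix(m):
    """The same polynomial with the matrix m substituted for the variable."""
    sq = mat_mul(m, m)
    ident = [[1, 0], [0, 1]]
    return [[sq[i][j] - 4 * m[i][j] + ident[i][j] for j in range(2)] for i in range(2)]

a = [[1, 2], [3, 2]]  # eigenvalues -1 and 4
image_of_spectrum = sorted((Q(lam) for lam in spectrum(a)), key=sort_key)
spectrum_of_image = spectrum(Q_of_matrix(a))
```

Here $Q(-1) = 6$ and $Q(4) = 1$, and the matrix $Q(a)$ has eigenvalues 1 and 6, so the two sets coincide as the theorem asserts.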


Corollary 4.45. If $A$ is a complex Banach algebra with identity and if $a$ is in
$A$, then $r(a^n) = r(a)^n$ for every integer $n \ge 1$.

PROOF. This follows by taking $Q(\lambda) = \lambda^n$ in Theorem 4.44 and then using
the definition of the function $r$.

Corollary 4.46 (spectral radius formula). If $A$ is a complex Banach algebra
with identity and if $a$ is in $A$, then

$$r(a) = \lim_{n \to \infty} \|a^n\|^{1/n},$$

the limit existing.


PROOF. For every $n$, Corollary 4.45 and Proposition 4.37 give $r(a)^n = r(a^n) \le
\|a^n\|$ and thus $r(a) \le \|a^n\|^{1/n}$. Hence

$$r(a) \le \liminf_n \|a^n\|^{1/n}. \qquad (*)$$

If $|\lambda| < \|a\|^{-1}$ and $\ell$ is in the dual space $A^*$, then Proposition 4.34 yields
$(1 - \lambda a)^{-1} = \sum_{n=0}^{\infty} \lambda^n a^n$ and therefore $\ell((1 - \lambda a)^{-1}) = \sum_{n=0}^{\infty} \ell(a^n)\lambda^n$.
Theorem 4.38 shows that $\lambda \mapsto \ell((1 - \lambda a)^{-1})$ is analytic for $\lambda^{-1}$ in $P(a)$, and
Proposition 4.37 shows that this analyticity occurs for $|\lambda|^{-1} > r(a)$, hence for
$|\lambda| < r(a)^{-1}$. Therefore the power series $\sum_{n=0}^{\infty} \ell(a^n)\lambda^n$ is convergent for $|\lambda| <
r(a)^{-1}$. Since the terms of a convergent series are bounded, each fixed $\lambda$ within
the disk of convergence must have $|\ell(a^n)|\,|\lambda^n| \le M_\ell$ for some constant $M_\ell$. That
is,

$$|\ell(\lambda^n a^n)| \le M_\ell \qquad (**)$$

for all $n$. Each linear functional on $A^*$ given by $\ell \mapsto \ell(\lambda^n a^n)$ is bounded, and
therefore the system of such linear functionals as $n$ varies, which has been shown in
(**) to be pointwise bounded, satisfies $\|\lambda^n a^n\| \le M$ by the Uniform Boundedness
Theorem. Consequently $|\lambda|\,\|a^n\|^{1/n} \le M^{1/n}$. Taking the limsup of both sides
gives $|\lambda| \limsup_n \|a^n\|^{1/n} \le 1$, and hence $\limsup_n \|a^n\|^{1/n} \le |\lambda|^{-1}$. Since $\lambda$ is an
arbitrary complex number with $|\lambda|^{-1} > r(a)$, we obtain $\limsup_n \|a^n\|^{1/n} \le r(a)$.
In combination with (*), this inequality completes the proof.
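
The formula is most striking when $\|a\|$ is far larger than $r(a)$. The sketch below, again in the algebra of 2-by-2 matrices with the maximum absolute row sum as norm, uses an upper triangular matrix whose only eigenvalue is 0.5 but whose norm is 10.5, and records $\|a^n\|^{1/n}$ for $n$ up to 400. The matrix, the norm, and the range of $n$ are illustrative assumptions.

```python
def mat_mul(x, y):
    """Product of two 2-by-2 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def norm(x):
    """Maximum absolute row sum, a submultiplicative matrix norm."""
    return max(sum(abs(v) for v in row) for row in x)

a = [[0.5, 10.0], [0.0, 0.5]]  # upper triangular, so the spectrum is {0.5}
spectral_radius = 0.5

# roots[n] holds the n-th root of the norm of a^n.
roots = {}
power = [[1.0, 0.0], [0.0, 1.0]]
for n in range(1, 401):
    power = mat_mul(power, a)
    roots[n] = norm(power) ** (1.0 / n)
```

Every entry of `roots` stays at or above the spectral radius, as inequality (*) in the proof requires, and the values decrease from 10.5 toward 0.5 as $n$ grows.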


The spectral radius formula gives us the following quantitative conclusion 
about the Gelfand transform. 


Corollary 4.47. The Gelfand transform for a commutative complex Banach
algebra $\mathcal{A}$ with identity is norm preserving from $\mathcal{A}$ to $C(\mathcal{A}^*_m)$ if and only if
$\|a^2\| = \|a\|^2$ for all $a$ in $\mathcal{A}$.


PROOF. If $\|a^2\| = \|a\|^2$ for all $a$, then induction gives $\|a^{2^n}\| = \|a\|^{2^n}$ and
thus $\|a\| = \|a^{2^n}\|^{2^{-n}}$. Hence $\|a\| = \lim_n \|a^{2^n}\|^{2^{-n}}$. This limit equals $r(a)$ by the
spectral radius formula (Corollary 4.46), and $r(a)$ equals $\|\hat{a}\|_{\sup}$ by Proposition
4.43b. Therefore $\|a\| = \|\hat{a}\|_{\sup}$.

Conversely if $\|\hat{a}\|_{\sup} = \|a\|$ for all $a$, then $r(a) = \|a\|$ by Proposition 4.43b,
and $\|a^2\| = r(a^2) = r(a)^2 = \|a\|^2$ by Corollary 4.45.


This represents some progress. The condition $\|a^2\| = \|a\|^2$ is satisfied in $L^\infty$,
and hence the Gelfand transform is a norm-preserving algebra homomorphism of
$L^\infty$ into $C(\mathcal{A}^*_m)$. In $L^1$ after an identity is adjoined, the condition $\|a^2\| = \|a\|^2$
is not universally satisfied, and the corollary says that the Gelfand transform, i.e.,
the Fourier transform, is not norm preserving; this conclusion has content, but
it is not a surprise. In the case of the complex Banach algebra generated by the
identity and a bounded self-adjoint operator $A$, the condition $\|a^2\| = \|a\|^2$ is
satisfied for $a = A$ as a consequence of Proposition 2.2 with $L = A^*A$, but it is
less transparent what happens with other operators in the Banach algebra that are
not self adjoint.

The final step is to bring the operation $(\,\cdot\,)^*$ into play. An involution of a
complex Banach algebra $\mathcal{A}$ is a map $a \mapsto a^*$ of $\mathcal{A}$ into itself with the properties
that the following hold for all $a$ and $b$ in $\mathcal{A}$:

(i) $a^{**} = a$,
(ii) $(a+b)^* = a^* + b^*$,
(iii) $(\lambda a)^* = \bar{\lambda} a^*$ for all $\lambda$ in $\mathbb{C}$,
(iv) $(ab)^* = b^*a^*$.

A complex Banach algebra $\mathcal{A}$ with involution $(\,\cdot\,)^*$ is called a C* algebra if
(v) $\|a^*a\| = \|a\|^2$ for all $a$ in $\mathcal{A}$.


Our examples ($B(S)$ and certain subalgebras, $L^\infty$, $L^1$, and $B(H,H)$) are all
complex Banach algebras with involution. For $B(S)$ and $L^\infty$, the involution is
complex conjugation. For $L^1$ it is $f \mapsto g$ with $g(x) = \overline{f(-x)}$, and for $B(H,H)$
it is adjoint. Of these examples all but $L^1$ are C* algebras.

Observe that (i) and (iv) imply that the element $1$, if it is present, has to satisfy
$1^* = 1$ because $1 = (1^*)^* = (1\,1^*)^* = 1^{**}1^* = 1\,1^* = 1^*$. If (v) holds also, then
(v) with $a = 1$ shows that $\|1\| = 1$.


Theorem 4.48. If $\mathcal{A}$ is a commutative C* algebra with identity, then the
Gelfand transform is a norm-preserving algebra isomorphism of $\mathcal{A}$ onto $C(\mathcal{A}^*_m)$,
and it carries $(\,\cdot\,)^*$ into complex conjugation.


PROOF. For any $a$ in $\mathcal{A}$, (v) gives $\|a\|^2 = \|a^*a\| \le \|a^*\|\,\|a\|$. If $a = 0$, then
$a^* = 0$; otherwise division by $\|a\|$ gives $\|a\| \le \|a^*\|$. Applying this inequality
to $a^*$ and using (i), we obtain

$$\|a^*\| = \|a\|. \tag{$*$}$$

Next suppose that $b$ is an element of $\mathcal{A}$ with $b^* = b$. Raising to powers gives
$(b^{2^n})^* = b^{2^n}$ for $n \ge 0$. Then (v) gives $\|b^{2^n}\| = \|(b^{2^{n-1}})^* b^{2^{n-1}}\| = \|b^{2^{n-1}}\|^2$,
and induction shows that $\|b^{2^n}\| = \|b\|^{2^n}$. Hence $\|b\| = \|b^{2^n}\|^{2^{-n}}$. Taking the
limit and applying the spectral radius formula and Proposition 4.43b, we obtain

$$\|b\| = \lim_n \|b^{2^n}\|^{2^{-n}} = r(b) = \|\hat{b}\|_{\sup}. \tag{$**$}$$

The Gelfand transform is an algebra homomorphism by Proposition 4.43. If a
general $a$ is given in $\mathcal{A}$, then we can apply (*) to $a$ and (**) to $b = a^*a$ to obtain

$$\begin{aligned}
\|a^*\|\,\|a\| = \|a\|^2 = \|a^*a\| = \|b\| = \|\hat{b}\|_{\sup}
&= \|\widehat{a^*a}\|_{\sup} = \|\widehat{a^*}\,\hat{a}\|_{\sup} \\
&\le \|\widehat{a^*}\|_{\sup}\,\|\hat{a}\|_{\sup} \le \|a^*\|\,\|a\|,
\end{aligned}$$

the last inequality holding since the Gelfand transform has norm $\le 1$ according to
Proposition 4.43. The end expressions are equal, and equality must hold throughout.
Therefore $\|\hat{a}\|_{\sup} = \|a\|$, and the Gelfand transform is norm preserving.

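In working toward proving that the Gelfand transform carries $(\,\cdot\,)^*$ into complex
conjugation, we first show that

$$b^* = b \quad\text{implies}\quad i \text{ is not in } \sigma(b). \tag{$\dagger$}$$

Assuming the contrary, we find that $1$ is in $\sigma(-ib)$. By the Spectral Mapping
Theorem (Theorem 4.44), $\lambda + 1$ is in $\sigma(\lambda - ib)$ for all real $\lambda$. Hence

$$\begin{aligned}
(\lambda+1)^2 \le (r(\lambda - ib))^2 \le \|\lambda - ib\|^2 &= \|(\lambda - ib)^*(\lambda - ib)\| \\
&= \|(\lambda + ib)(\lambda - ib)\| = \|\lambda^2 + b^2\| \le \lambda^2\|1\| + \|b^2\| = \lambda^2 + \|b\|^2,
\end{aligned}$$

and $2\lambda + 1 \le \|b\|^2$ for all real $\lambda$. This is a contradiction, and (†) is proved.
Next let us deduce from (†) that

$$b^* = b \quad\text{implies}\quad \sigma(b) \subseteq \mathbb{R}. \tag{$\dagger\dagger$}$$

Suppose that $\lambda = \alpha + i\beta$ has $\alpha$ and $\beta$ real and $\beta \ne 0$. Then $\beta^{-1}(b - \lambda) =
\beta^{-1}(b - \alpha) - i$. The element $\beta^{-1}(b - \alpha)$ has $(\beta^{-1}(b - \alpha))^* = \beta^{-1}(b - \alpha)$, and
(†) shows that $i$ is not in its spectrum. Therefore $\beta^{-1}(b - \lambda) = \beta^{-1}(b - \alpha) - i$
is invertible. Since $\beta \ne 0$, $b - \lambda$ is invertible. Therefore $\lambda$ is not in $\sigma(b)$. This
proves (††).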

Now we shall show that the Gelfand transform carries $(\,\cdot\,)^*$ into complex
conjugation. Let $a$ be in $\mathcal{A}$, and write $a = \tfrac12(a + a^*) - \tfrac{i}{2}\big((ia) + (ia)^*\big) = b + ic$
with $b^* = b$ and $c^* = c$. Then $a^* = b - ic$. From (††) we know that $\hat{b}$ and $\hat{c}$ are
real-valued. Therefore $\widehat{a^*}(\ell) = \hat{b}(\ell) - i\hat{c}(\ell) = \overline{\hat{b}(\ell) + i\hat{c}(\ell)} = \overline{\hat{a}(\ell)}$, as asserted.

Since the Gelfand transform is norm preserving, respects products, and carries
$1$ into $1$, its image is a uniformly closed subalgebra of $C(\mathcal{A}^*_m)$. The fact
that $(\,\cdot\,)^*$ is carried into complex conjugation implies that the image is closed
under complex conjugation. The image separates points of $\mathcal{A}^*_m$ by definition of
equality of linear functionals. By the Stone–Weierstrass Theorem the image is
all of $C(\mathcal{A}^*_m)$. This completes the proof.

Among our examples, if $\mathcal{A}$ is a conjugate-closed Banach subalgebra of $B(S)$
with identity, then Theorem 4.48 reproduces the Stone Representation Theorem
(Theorem 4.15).

Second if $\mathcal{A}$ is $L^\infty$, Theorem 4.48 gives us something new, identifying $L^\infty$
with $C((L^\infty)^*_m)$. We do not get a total understanding of $(L^\infty)^*_m$, but we do get
some understanding from the fact that every ideal is contained in a maximal ideal.
We can produce an ideal in $L^\infty$ merely by specifying a measurable subset; the
ideal consists of all essentially bounded functions, modulo null functions, that
vanish on that set. As the set gets smaller, we get closer to the situation of a
maximal ideal.

Third if $\mathcal{A}$ is $L^1$, Theorem 4.48 gives us no information since $L^1$ is not a C*
algebra. The theory of complex Banach algebras can be pursued in a direction
that specializes to more information about $L^1$, but we shall not follow such a
route.

Fourth if $\mathcal{A}$ is the complex Banach algebra generated by the identity and a
bounded self-adjoint operator $A$ on a Hilbert space $H$, then Theorem 4.48 is
applicable and realizes the algebra as $C(\mathcal{A}^*_m)$. We shall see in the next section
that $\mathcal{A}^*_m$ can be viewed as the spectrum $\sigma(A)$. However, the Hilbert space $H$ plays
no role in this realization, and we therefore cannot expect to learn much about our
original operator from $C(\mathcal{A}^*_m)$. For example we cannot distinguish between the
two operators on $\mathbb{C}^3$ given by diagonal matrices $\operatorname{diag}(1, 1, 2)$ and $\operatorname{diag}(1, 2, 2)$ on
the basis of the spectrum of each. The goal of the next section is to remedy this
defect.
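The last example can be checked directly. The following is a minimal sketch, assuming that a diagonal matrix is stored as the tuple of its diagonal entries; the helper names `spectrum` and `multiplicities` are hypothetical and used only for this illustration.

```python
from collections import Counter


def spectrum(diagonal):
    """Spectrum of a diagonal matrix: the set of distinct diagonal entries."""
    return set(diagonal)


def multiplicities(diagonal):
    """Eigenvalue multiplicities, which the spectrum alone does not record."""
    return Counter(diagonal)


d1 = (1, 1, 2)  # diag(1, 1, 2)
d2 = (1, 2, 2)  # diag(1, 2, 2)
same_spectrum = spectrum(d1) == spectrum(d2)
same_multiplicities = multiplicities(d1) == multiplicities(d2)
print(same_spectrum, same_multiplicities)  # True False
```

Both operators have spectrum $\{1, 2\}$, so the space of continuous functions on the spectrum is the same for both; only the multiplicities, which depend on how the operator acts on the Hilbert space, tell them apart.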

Since we shall want to consider operators in $B(H,H)$ as belonging to more
than one C* algebra, let us take another look at the definition of the spectrum of
an element. The spectrum of $a$, as a member of $\mathcal{A}$, is the set of complex $\lambda$ for
which $(a - \lambda)^{-1}$ fails to exist as a member of $\mathcal{A}$. Certainly if we have $\mathcal{A}_1 \subseteq \mathcal{A}_2$
and $a$ is in $\mathcal{A}_1$, then the failure of $(a - \lambda)^{-1}$ to exist in $\mathcal{A}_2$ implies the failure of
$(a - \lambda)^{-1}$ to exist in $\mathcal{A}_1$. Hence the spectrum relative to $\mathcal{A}_1$ contains the spectrum
relative to $\mathcal{A}_2$. The spectrum is the smallest for $\mathcal{A} = B(H,H)$. The following
corollary implies that for operators $A$ with $AA^* = A^*A$, the smallest possible
spectrum is already achieved for the C* algebra generated by $1$, $A$, and $A^*$.


Corollary 4.49. If $\mathcal{A}$ is a C* algebra with identity and if $a$ is an invertible
element of $\mathcal{A}$ such that $aa^* = a^*a$, then $a$ is invertible already in the smallest
closed subalgebra $\mathcal{A}_0$ of $\mathcal{A}$ containing $1$, $a$, and $a^*$.


PROOF. Since $a^{-1}a^* = a^{-1}(a^*a)a^{-1} = a^{-1}(aa^*)a^{-1} = a^*a^{-1}$, the smallest
closed subalgebra $\mathcal{A}_1$ of $\mathcal{A}$ containing $1$, $a$, $a^*$, $a^{-1}$, and $(a^{-1})^*$ is commutative,
hence is a commutative C* algebra with identity. Form the Gelfand transform
$b \mapsto \hat{b}$ for $\mathcal{A}_1$. Then $\hat{a}$ and $\widehat{a^{-1}}$ are reciprocals, and the image of $\hat{a}$ is therefore
bounded away from $0$. By the Stone–Weierstrass Theorem we can find a sequence
$\{p_n(z, \bar{z})\}$ of polynomial functions that converge uniformly on the compact image
of $\hat{a}$ to $1/z$. Since by Theorem 4.48, the Gelfand transform is isometric for $\mathcal{A}_1$,
we have $a^{-1} = \lim p_n(a, a^*)$ in $\mathcal{A}_1$, and $a^{-1}$ is therefore exhibited as a member
of $\mathcal{A}_0$.

The goal of this section is to expand upon Theorem 4.48 in the case of a commutative
C* algebra of bounded linear operators on a Hilbert space in such a way that
the Hilbert space plays a decisive role. The result will be the Spectral Theorem,
and we shall see how the Spectral Theorem enables one to compute with the
operators in question. The theorem to be given here is limited to the case of a
separable Hilbert space, and the assumption of separability will be included in
all our results about general spaces $B(H,H)$. The Spectral Theorem will enable
us to view the operators in question as multiplications by $L^\infty$ functions on an $L^2$
space, and we therefore begin with that example.


EXAMPLE. Let $(S, \mu)$ be a finite measure space, and let $H$ be the Hilbert space
$H = L^2(S, \mu)$. For $f$ in $L^\infty(S, \mu)$, define $M_f : L^2 \to L^2$ by $M_f(g) = fg$.
The computation

$$\|M_f(g)\|_2^2 = \int_S |fg|^2\,d\mu \le \|f\|_\infty^2 \int_S |g|^2\,d\mu = \|f\|_\infty^2\,\|g\|_2^2$$

shows that $M_f$ is a bounded operator on $H$ with $\|M_f\| \le \|f\|_\infty$. Shortly we
shall check that equality holds:

$$\|M_f\| = \|f\|_\infty. \tag{$*$}$$

But first, let us observe that

$$M_{fg} = M_f M_g, \qquad M_{\alpha f + \beta g} = \alpha M_f + \beta M_g, \qquad M_f^* = M_{\bar{f}}, \qquad M_1 = I.$$

These facts, in combination with (*), say that $f \mapsto M_f$ is a norm-preserving
C* algebra isomorphism of the commutative C* algebra $L^\infty(S, \mu)$ onto the
subalgebra

$$\mathcal{M}(L^2(S, \mu)) = \{M_f \in B(L^2(S, \mu), L^2(S, \mu)) \mid f \in L^\infty(S, \mu)\}$$

of the C* algebra $B(L^2(S, \mu), L^2(S, \mu))$. The algebra $\mathcal{M}(L^2(S, \mu))$ is called
the multiplication algebra on $L^2(S, \mu)$. Returning to the verification of (*), let
$\epsilon > 0$ be given with $\epsilon < \|f\|_\infty$, and let

$$E = \{x \mid |f(x)| \ge \|f\|_\infty - \epsilon\}.$$

Then $0 < \mu(E) < \infty$, and we take $g$ to be the function that is $1$ on $E$ and is $0$ on
$E^c$. Then $\|g\|_2 = \mu(E)^{1/2}$, and

$$\|fg\|_2^2 = \int_S |fg|^2\,d\mu = \int_E |f|^2\,d\mu \ge (\|f\|_\infty - \epsilon)^2\,\mu(E).$$

Therefore

$$(\|f\|_\infty - \epsilon)\,\mu(E)^{1/2} \le \|M_f g\|_2 \le \|M_f\|\,\|g\|_2 = \|M_f\|\,\mu(E)^{1/2},$$

and $\|f\|_\infty - \epsilon \le \|M_f\|$. Since we already know that $\|M_f\| \le \|f\|_\infty$ and since
$\epsilon$ is arbitrary, we conclude that (*) holds.
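The estimate can be tried out on a toy measure space. The following is a minimal sketch, assuming a finite set $S$ carrying point masses (so that integrals are finite weighted sums); the names `l2_norm`, `ess_sup`, `mu`, `f`, and `eps` are hypothetical and introduced only for this illustration.

```python
import math


def l2_norm(g, mu):
    """L^2 norm of g on a finite set with point masses mu[s]."""
    return math.sqrt(sum(abs(g[s]) ** 2 * mu[s] for s in mu))


def ess_sup(f, mu):
    """Essential supremum of |f|: points of mass zero are ignored."""
    return max(abs(f[s]) for s in mu if mu[s] > 0)


mu = {"p": 0.5, "q": 0.3, "r": 0.2, "null": 0.0}
f = {"p": 1.0, "q": -3.0, "r": 2.0, "null": 100.0}
eps = 0.5

bound = ess_sup(f, mu)                               # ||f||_infinity
E = {s for s in mu if abs(f[s]) >= bound - eps}      # the set E of the example
g = {s: (1.0 if s in E else 0.0) for s in mu}        # indicator function of E
fg = {s: f[s] * g[s] for s in mu}
ratio = l2_norm(fg, mu) / l2_norm(g, mu)             # ||M_f g||_2 / ||g||_2
print(bound, round(ratio, 12))
```

Here the value $100$ sits on a point of mass zero and therefore affects neither $\|f\|_\infty = 3$ nor the ratio, which lies between $\|f\|_\infty - \epsilon$ and $\|f\|_\infty$ as the example predicts.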


Now let us consider an arbitrary bounded self-adjoint linear operator on a
separable Hilbert space. We mentioned at the end of Section 10 the two operators
on $\mathbb{C}^3$ given by diagonal matrices $\operatorname{diag}(1, 1, 2)$ and $\operatorname{diag}(1, 2, 2)$. The C* algebras
generated by these operators are isomorphic 2-dimensional algebras, and hence
there is no way to superimpose on the setting of Theorem 4.48 the action of the
operators on the Hilbert space $\mathbb{C}^3$ if we consider these operators by themselves.
The operators do get distinguished, however, if we enlarge the C* algebra under
consideration, working instead with the 3-dimensional commutative C* algebra
of all diagonal matrices. In the general situation, as long as we are going to
enlarge the algebra of operators under consideration, we may as well enlarge it
as much as possible while keeping it commutative.

If $H$ is a Hilbert space, a maximal abelian self-adjoint subalgebra in
$B(H,H)$ is a commutative C* subalgebra of $B(H,H)$ that is not contained in
any larger commutative subalgebra of $B(H,H)$ that is closed under $(\,\cdot\,)^*$. In the
example with $H = \mathbb{C}^3$ in the previous paragraph, the 3-dimensional algebra of
diagonal matrices is a maximal abelian self-adjoint subalgebra.

For general $H$, we shall obtain a simple criterion for a subalgebra to be maximal
abelian self-adjoint, we shall show that the multiplication algebra for an $L^2$ space
with respect to a finite measure meets this criterion, and then we shall see that
maximal abelian self-adjoint subalgebras have a special property that will allow
us to incorporate the Hilbert space into an application of Theorem 4.48.

If $\mathcal{T}$ is a subset of $B(H,H)$, let

$$\mathcal{T}' = \{A \in B(H,H) \mid AB = BA \text{ for all } B \in \mathcal{T}\}.$$

The set $\mathcal{T}'$ is a subalgebra of $B(H,H)$ containing the identity and called the
commuting algebra of $\mathcal{T}$. It has the following properties:

(i) $\mathcal{T}'$ is closed in the operator-norm topology,
(ii) $\mathcal{T}' \supseteq \mathcal{T}$ if and only if $\mathcal{T}$ is commutative,
(iii) if $\mathcal{T}$ is stable under $(\,\cdot\,)^*$, then $\mathcal{T}'$ is stable under $(\,\cdot\,)^*$ and hence is a C*
subalgebra of $B(H,H)$,
(iv) a subalgebra $\mathcal{A}$ of $B(H,H)$ stable under $(\,\cdot\,)^*$ is a maximal abelian self-adjoint
subalgebra of $B(H,H)$ if and only if $\mathcal{A}' = \mathcal{A}$.

All of these properties are verified by inspection except possibly the assertion in
(iv) that $\mathcal{A}$ maximal implies that $\mathcal{A}'$ does not strictly contain $\mathcal{A}$. For this assertion
let $\mathcal{A}$ be maximal, and suppose that $B$ lies in $\mathcal{A}'$ but not $\mathcal{A}$. Since $\mathcal{A}$ is stable under
$(\,\cdot\,)^*$, $B^*$ lies in $\mathcal{A}'$, and so do the self-adjoint operators $B + B^*$ and $i(B - B^*)$.
At least one of these two is not in $\mathcal{A}$, since $B$ is a linear combination of them. That
operator and $\mathcal{A}$ together generate
a C* subalgebra that is commutative and strictly contains $\mathcal{A}$, in contradiction to
the maximality of $\mathcal{A}$. This proves (iv).


Proposition 4.50. If $(S, \mu)$ is a finite measure space, then the multiplication
algebra on $L^2(S, \mu)$ is a maximal abelian self-adjoint subalgebra of the algebra
$B(L^2(S, \mu), L^2(S, \mu))$.


PROOF. Write $\mathcal{M}$ for $\mathcal{M}(L^2(S, \mu))$. Since $\mathcal{M}$ is commutative, (ii) shows that
$\mathcal{M}' \supseteq \mathcal{M}$. Since $\mathcal{M}$ is stable under $(\,\cdot\,)^*$, (iv) shows that it is enough to prove
that $\mathcal{M}' \subseteq \mathcal{M}$. Thus let $T$ be in $\mathcal{M}'$, and define an $L^2$ function $g$ by $g = T(1)$.
If $f$ is in $L^\infty$, then the fact that $T$ is in $\mathcal{M}'$ implies that

$$Tf = TM_f(1) = M_f T(1) = M_f g = fg.$$

If the set where $N \le |g(x)| \le N + 1$ has positive measure, then an argument in
the example with $L^2(S, \mu)$ shows that $\|T\| \ge N$. Since $T$ is assumed bounded,
we conclude that $g$ is in $L^\infty$. Therefore $Tf = M_g f$ for all $f$ in $L^\infty$. Since $L^\infty$
is dense in $L^2$ for a finite measure space and since $T$ and $M_g$ are both bounded,
$T = M_g$. Therefore $T$ is exhibited as in $\mathcal{M}$, and the proof that $\mathcal{M}' \subseteq \mathcal{M}$ is
complete.


We come now to the special property of maximal abelian self-adjoint subalgebras
that will allow us to bring the Hilbert space into play when applying Theorem
4.48 to these subalgebras. If $\mathcal{A}$ is any subalgebra of $B(H,H)$, a vector $x$ in $H$ is
called a cyclic vector for $\mathcal{A}$ if the vector subspace $\mathcal{A}x$ of $H$ is dense in $H$.


Lemma 4.51. Let $H$ be a complex Hilbert space, let $K \subseteq H$ be a closed vector
subspace, and let $E$ be the orthogonal projection of $H$ on $K$. If $\mathcal{A}$ is a subalgebra
of $B(H,H)$ that is stable under $(\,\cdot\,)^*$ and has the property that $A(K) \subseteq K$ for all
$A$ in $\mathcal{A}$, then $E$ is in $\mathcal{A}'$.


PROOF. Since $A(K) \subseteq K$, $AE(x)$ is in $K$ for all $x$ in $H$. Therefore $AE(x) =
EAE(x)$ for all $x$ in $H$, and $AE = EAE$. Since $E^* = E$ and since $\mathcal{A}$ is
stable under $(\,\cdot\,)^*$, $A^*E = EA^*E$. Consequently $EA = E^*A = (A^*E)^* =
(EA^*E)^* = EAE = AE$, and $E$ is in $\mathcal{A}'$.


Proposition 4.52. If $H$ is a complex separable Hilbert space and $\mathcal{A}$ is a
maximal abelian self-adjoint subalgebra of $B(H,H)$, then $\mathcal{A}$ has a cyclic vector.


REMARKS. The 2-dimensional subalgebras that we introduced in connection
with $\mathbb{C}^3$ have no cyclic vectors, as we see by a count of dimensions; however, the
full 3-dimensional diagonal subalgebra has $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ as a cyclic vector since

$$\begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} a \\ b \\ c \end{pmatrix}.$$


PROOF. For each $x$ in $H$, form the closed vector subspace $(\mathcal{A}x)^{\mathrm{cl}}$. Since the
identity is in $\mathcal{A}$, $x$ is in $\mathcal{A}x$. Since $\mathcal{A}x$ is stable under $\mathcal{A}$ and since the members
of $\mathcal{A}$ are bounded operators, $(\mathcal{A}x)^{\mathrm{cl}}$ is stable under $\mathcal{A}$. The vector subspace $\mathcal{A}x$
has the property that

$$y \perp \mathcal{A}x \quad\text{implies}\quad \mathcal{A}y \perp \mathcal{A}x \tag{$*$}$$

because $(Ax, By) = (B^*Ax, y) = 0$ if $A$ and $B$ are in $\mathcal{A}$. Consider orthonormal
subsets $\{x_\alpha\}$ in $H$ such that $\mathcal{A}x_\alpha \perp \mathcal{A}x_\beta$ for $\alpha \ne \beta$. Such sets exist, the empty set
being one. By Zorn's Lemma let $S = \{x_\alpha\}$ be a maximal such set. This maximal
$S$ has the property that

$$H = \Big(\bigoplus_\alpha \mathcal{A}x_\alpha\Big)^{\mathrm{cl}},$$

since otherwise we could obtain a contradiction by adjoining any unit vector in
$\big(\big(\bigoplus_\alpha \mathcal{A}x_\alpha\big)^{\mathrm{cl}}\big)^\perp$ to $S$ and applying (*). Since $H$ is separable, $S$ is countable.
Let us enumerate its members as $x_1, x_2, \dots$. Put $z = \sum_{n=1}^\infty 2^{-n}x_n$. This series
converges in $H$ since $H$ is complete, and we shall prove that the sum $z$ is a cyclic
vector for $\mathcal{A}$.

Lemma 4.51 implies that the orthogonal projection $E_n$ of $H$ onto $(\mathcal{A}x_n)^{\mathrm{cl}}$ is in
$\mathcal{A}'$. Since $\mathcal{A}$ is a maximal abelian self-adjoint subalgebra of $B(H,H)$, $\mathcal{A}' = \mathcal{A}$.
Hence $E_n$ is in $\mathcal{A}$. Therefore $\mathcal{A}z \supseteq \mathcal{A}E_n z = \mathcal{A}2^{-n}x_n = \mathcal{A}x_n$ for all $n$, and we
obtain $(\mathcal{A}z)^{\mathrm{cl}} \supseteq \big(\bigoplus_n \mathcal{A}x_n\big)^{\mathrm{cl}} = H$. This completes the proof.
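The count of dimensions for the diagonal algebras on $\mathbb{C}^3$ can be carried out mechanically. The following is a minimal sketch, assuming that each diagonal matrix is stored as the tuple of its diagonal entries and that an algebra is represented by a finite spanning set; the helper names `rank` and `orbit_dimension` are hypothetical.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    rank_found = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank_found, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_found], rows[pivot] = rows[pivot], rows[rank_found]
        for r in range(len(rows)):
            if r != rank_found and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank_found][col]
                rows[r] = [a - factor * b
                           for a, b in zip(rows[r], rows[rank_found])]
        rank_found += 1
    return rank_found


def orbit_dimension(spanning_set, x):
    """Dimension of the span of {G x}, G running over diagonal matrices."""
    return rank([[g_i * x_i for g_i, x_i in zip(g, x)] for g in spanning_set])


x = (1, 1, 1)
# I and diag(1, 1, 2) span the algebra they generate, since D^2 = 3D - 2I.
small_algebra = [(1, 1, 1), (1, 1, 2)]
# The three diagonal matrix units span the algebra of all diagonal matrices.
full_algebra = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
dim_small = orbit_dimension(small_algebra, x)
dim_full = orbit_dimension(full_algebra, x)
print(dim_small, dim_full)  # 2 3
```

For the 2-dimensional algebra the subspace $\mathcal{A}x$ has dimension at most 2 for every $x$, so no vector of $\mathbb{C}^3$ is cyclic, whereas the full diagonal algebra moves $(1, 1, 1)$ onto all of $\mathbb{C}^3$.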


If $H_1$ and $H_2$ are complex Hilbert spaces, a unitary operator $U$ from $H_1$ to
$H_2$ is a linear operator from $H_1$ onto $H_2$ with $\|Ux\|_{H_2} = \|x\|_{H_1}$ for all $x$ in $H_1$.
Such an operator is invertible, and its inverse is unitary. By means of polarization,
one sees that a unitary operator satisfies also the identity $(Ux, Uy)_{H_2} = (x, y)_{H_1}$,
i.e., that the inner product is preserved. Therefore a unitary operator provides the
natural notion of isomorphism between two Hilbert spaces.

Theorem 4.53. If $H$ is a nonzero complex separable Hilbert space and $\mathcal{A}$ is a
maximal abelian self-adjoint subalgebra of $B(H,H)$, then there exists a measure
space $(S, \mu)$ with $\mu(S) = 1$ and a unitary operator $U : H \to L^2(S, \mu)$ such that

$$U\mathcal{A}U^{-1} = \mathcal{M}(L^2(S, \mu)).$$


REMARK. In other words, under the assumption that $H$ is separable, any maximal
abelian self-adjoint subalgebra of $B(H,H)$ is isomorphic to the multiplication
algebra for the $L^2$ space relative to some finite measure.


PROOF. Applying Proposition 4.52, let $z$ be a unit cyclic vector for $\mathcal{A}$. Let
us see that the linear map of $\mathcal{A}$ into $H$ given by $A \mapsto Az$ is one-one. In fact, if
$Az = 0$, then every $B$ in $\mathcal{A}$ has $A(Bz) = BAz = B0 = 0$. Since $\mathcal{A}z$ is dense in
$H$ and $A$ is bounded, $A$ is $0$.

We saw before Proposition 4.50 that $\mathcal{A}$ is a commutative C* algebra with
identity. By Theorem 4.48 the Gelfand transform $A \mapsto \hat{A}$ is a norm-preserving
algebra isomorphism of $\mathcal{A}$ onto $C(\mathcal{A}^*_m)$ carrying $(\,\cdot\,)^*$ to complex conjugation.
Define a linear functional $\ell$ on $C(\mathcal{A}^*_m)$ by

$$\ell(\hat{A}) = (Az, z)_H,$$

the inner product being the inner product in $H$. Let us see that the linear functional
$\ell$ is positive. In fact, any function $\ge 0$ in $C(\mathcal{A}^*_m)$ is the absolute value squared of
some element of $C(\mathcal{A}^*_m)$, hence is of the form $|\hat{A}|^2$. Then

$$\ell(|\hat{A}|^2) = \ell(\overline{\hat{A}}\,\hat{A}) = \ell(\widehat{A^*A}) = (A^*Az, z)_H = (Az, Az)_H \ge 0.$$

By the Riesz Representation Theorem, there exists a unique regular Borel
measure $\mu$ on $\mathcal{A}^*_m$ such that

$$\ell(\hat{A}) = \int_{\mathcal{A}^*_m} \hat{A}\,d\mu$$

for all $\hat{A}$ in $C(\mathcal{A}^*_m)$. The measure $\mu$ has total mass equal to $\ell(1) = \ell(\hat{I}) =
(Iz, z)_H = \|z\|_H^2 = 1$.

We shall now construct the unitary operator $U$ carrying $H$ to $L^2(\mathcal{A}^*_m, \mu)$. On
the dense vector subspace $\mathcal{A}z$ of $H$, define a linear mapping $U_0$ by

$$U_0(Az) = \hat{A} \in C(\mathcal{A}^*_m) \subseteq L^2(\mathcal{A}^*_m, \mu).$$

This is well defined since, as we have seen, $Az = 0$ implies $A = 0$. On the vector
subspace $\mathcal{A}z$, we have

$$\|U_0(Az)\|^2_{L^2(\mathcal{A}^*_m,\mu)} = \int_{\mathcal{A}^*_m} |\hat{A}|^2\,d\mu = \int_{\mathcal{A}^*_m} \overline{\hat{A}}\,\hat{A}\,d\mu = \ell(\widehat{A^*A}) = (A^*Az, z)_H = \|Az\|_H^2.$$

Hence $U_0$ is an isometry from the dense subset $\mathcal{A}z$ of $H$ into $L^2(\mathcal{A}^*_m, \mu)$. By uniform
continuity, $U_0$ extends to an isometry $U$ from $H$ into $L^2(\mathcal{A}^*_m, \mu)$. As the continuous
extension of the linear function $U_0$, $U$ is linear. The image of $U$ contains $C(\mathcal{A}^*_m)$,
which is dense in $L^2(\mathcal{A}^*_m, \mu)$, and the image is complete, being isometric with $H$.
Therefore the image of $U$ is closed. Consequently $U$ carries $H$ onto $L^2(\mathcal{A}^*_m, \mu)$
and is unitary.

We still have to check that $U\mathcal{A}U^{-1} = \mathcal{M}(L^2(\mathcal{A}^*_m, \mu))$. If $A$ and $B$ are in $\mathcal{A}$,
then

$$UAU^{-1}(\hat{B}) = UA(Bz) = U(ABz) = \widehat{AB} = \hat{A}\hat{B} = M_{\hat{A}}\hat{B}.$$

Since $UAU^{-1}$ and $M_{\hat{A}}$ are bounded and since the $\hat{B}$'s are dense in $L^2(\mathcal{A}^*_m, \mu)$,
$UAU^{-1} = M_{\hat{A}}$. Therefore $U\mathcal{A}U^{-1} \subseteq \mathcal{M}(L^2(\mathcal{A}^*_m, \mu))$. Next let $T$ be in
$\mathcal{M}(L^2(\mathcal{A}^*_m, \mu))$. Then $T$ commutes with every member of $\mathcal{M}(L^2(\mathcal{A}^*_m, \mu))$ and
in particular with every $UAU^{-1}$. Thus $TUAU^{-1} = UAU^{-1}T$ for all $A$ in $\mathcal{A}$,
and $U^{-1}TUA = AU^{-1}TU$. Since $A$ is arbitrary in $\mathcal{A}$, $U^{-1}TU$ is in $\mathcal{A}'$. But $\mathcal{A}$ is
assumed to be a maximal abelian self-adjoint subalgebra, and therefore $\mathcal{A}' = \mathcal{A}$.
Consequently $U^{-1}TU$ is in $\mathcal{A}$. Say that $U^{-1}TU = A_0$. Then $T = UA_0U^{-1}$,
and $T$ is in $U\mathcal{A}U^{-1}$. Therefore $U\mathcal{A}U^{-1} = \mathcal{M}(L^2(\mathcal{A}^*_m, \mu))$.


The Spectral Theorem for a single bounded self-adjoint operator will be an
immediate consequence of Theorem 4.53 and an application of Zorn's Lemma.
But let us state the result (Theorem 4.54) so that it applies to a wider class of
operators, and to a commuting family of such operators rather than just one.

The first step is to define the kinds of bounded linear operators of interest. Let
$H$ be a complex Hilbert space. A bounded linear operator $A : H \to H$ is said to
be

• normal if $A^*A = AA^*$,
• positive semidefinite if it is self adjoint[22] and $(Ax, x) \ge 0$ for all $x \in H$,
• unitary if $A$ is onto $H$ and has $\|Ax\| = \|x\|$ for all $x \in H$.

Self-adjoint operators, having $A^* = A$, are certainly normal. Every operator of
the form $A^*A$ for some bounded linear $A$ is positive semidefinite. The definition
of "unitary" merely specializes the definition before Theorem 4.53 to the case that
$H_1 = H_2$. Unitary operators $A$ in the present setting, according to Proposition
2.6, are characterized by the condition that $A$ is invertible with $A^{-1} = A^*$, and
unitary operators are therefore normal.

In the case of multiplication operators $M_f$ by $L^\infty$ functions on $L^2$ of a finite
measure space, the adjoint of $M_f$ is $M_{\bar{f}}$. Then every $M_f$ is normal, $M_f$ is self
adjoint if and only if $f$ is real-valued a.e., $M_f$ is positive semidefinite if and only
if $f$ is $\ge 0$ a.e., and $M_f$ is unitary if and only if $|f| = 1$ a.e.

[22] The condition "self adjoint" can be shown to be automatic in the presence of the inequality
$(Ax, x) \ge 0$ for all $x$, but we shall not need to make use of this fact.

Theorem 4.54 (Spectral Theorem for bounded normal operators). Let $\{A_\alpha\}_{\alpha \in E}$
be a family of bounded normal operators on a complex separable Hilbert space
$H$, and suppose that $A_\alpha A_\beta = A_\beta A_\alpha$ and $A_\alpha A_\beta^* = A_\beta^* A_\alpha$ for all $\alpha$ and $\beta$. Then
there exist a finite measure space $(S, \mu)$, a unitary operator $U : H \to L^2(S, \mu)$,
and a set $\{f_\alpha\}_{\alpha \in E}$ of functions in $L^\infty(S, \mu)$ such that $UA_\alpha U^{-1} = M_{f_\alpha}$ for all $\alpha$
in $E$.

PROOF. Let $\mathcal{A}_0$ be the complex subalgebra of $B(H,H)$ generated by $I$ and all
$A_\alpha$ and $A_\alpha^*$ for $\alpha$ in $E$. This algebra is commutative and is stable under $(\,\cdot\,)^*$. Let
$\mathcal{S}$ be the set of all commutative subalgebras of $B(H,H)$ containing $\mathcal{A}_0$ and stable
under $(\,\cdot\,)^*$, and partially order $\mathcal{S}$ by inclusion upward. The union of the members
of a chain in $\mathcal{S}$ is an upper bound for the chain, and Zorn's Lemma therefore
produces a maximal element $\mathcal{A}$ in $\mathcal{S}$. Since $\mathcal{A}$ is maximal, it is necessarily
closed in the operator-norm topology. Then $\mathcal{A}$ is a maximal abelian self-adjoint
subalgebra of $B(H,H)$, and Theorem 4.53 is applicable. The theorem yields a
finite measure space $(S, \mu)$ and a unitary operator $U : H \to L^2(S, \mu)$ such that
$U\mathcal{A}U^{-1} = \mathcal{M}(L^2(S, \mu))$. For each $\alpha$ in $E$, we then have $UA_\alpha U^{-1} = M_{f_\alpha}$ for
some $f_\alpha$ in $L^\infty(S, \mu)$, as required.


In a corollary we shall characterize the spectra of operators that are self adjoint,
or positive semidefinite, or unitary. Implicitly in the statement and proof, we make
use of Corollary 4.49 when referring to $\sigma(A)$: the set $\sigma(A)$ is independent of
the Banach subalgebra of $B(H,H)$ from which it is computed as long as that
subalgebra contains $I$, $A$, and $A^*$. The corollary needs one further thing beyond
Theorem 4.54, and we give that in the lemma below.


Lemma 4.55. Let $(S, \mu)$ be a finite measure space, and form the Hilbert space
$L^2(S, \mu)$. For $f$ in $L^\infty(S, \mu)$, let $M_f$ be the operation of multiplication by $f$.
Define the essential image of $f$ to be

$$\{\lambda_0 \in \mathbb{C} \mid \mu(f^{-1}(\{\lambda \in \mathbb{C} \mid |\lambda - \lambda_0| < \epsilon\})) > 0 \text{ for every } \epsilon > 0\}.$$

Then

$$\sigma(M_f) = \text{essential image of } f.$$


PROOF. To prove $\subseteq$ in the asserted equality, let $\lambda_0$ be outside the essential
image, and choose $\epsilon > 0$ such that $f^{-1}(\{|\lambda - \lambda_0| < \epsilon\})$ has measure $0$. Then
$|f(x) - \lambda_0| \ge \epsilon$ a.e. Hence $1/(f - \lambda_0)$ is in $L^\infty$, and $M_{1/(f - \lambda_0)}$ exhibits $M_{f - \lambda_0}$
as invertible. Thus $\lambda_0$ is not in $\sigma(M_f)$.

For the inclusion $\supseteq$, suppose that $M_{f - \lambda_0}$ is invertible, with inverse $T$. For every
$g$ in $L^\infty$, we have $M_{f - \lambda_0}M_g = M_g M_{f - \lambda_0}$. Multiplying this equality by $T$ twice,
once on each side, we obtain $M_g T = TM_g$. By Proposition 4.50, $T$ is of the form $T = M_h$ for some
$h$ in $L^\infty$. Then we must have $(f - \lambda_0)h = 1$ a.e. Hence $|f(x) - \lambda_0| \ge \|h\|_\infty^{-1}$
a.e., and $\lambda_0$ is outside the essential image. This proves the lemma.

Corollary 4.56. Let $H$ be a complex separable Hilbert space, let $A$ be a normal
operator in $B(H,H)$, and let $\sigma(A)$ be the spectrum of $A$. Then
(a) $A$ is self adjoint if and only if $\sigma(A) \subseteq \mathbb{R}$,
(b) $A$ is positive semidefinite if and only if $\sigma(A) \subseteq [0, +\infty)$,
(c) $A$ is unitary if and only if $\sigma(A) \subseteq \{z \in \mathbb{C} \mid |z| = 1\}$.


PROOF. The corollary is immediate from Theorem 4.54 as long as the corollary
is proved for any multiplication operator $A = M_f$ by an $L^\infty$ function $f$ on the
Hilbert space $L^2(S, \mu)$. For this purpose we shall use Lemma 4.55.

In the case of (a), the operator $M_f$ is self adjoint if and only if $f$ is real-valued
a.e. If $f$ is real-valued, then the definition of essential image shows that $\lambda_0$ is not
in the essential image if $\lambda_0$ is nonreal. Conversely if every nonreal $\lambda_0$ is outside
the essential image, then to each such $\lambda_0$ we can associate a number $\epsilon_{\lambda_0} > 0$ for
which $f^{-1}(\{\lambda \in \mathbb{C} \mid |\lambda - \lambda_0| < \epsilon_{\lambda_0}\})$ has $\mu$ measure $0$. Countably many of the
open sets $\{\lambda \in \mathbb{C} \mid |\lambda - \lambda_0| < \epsilon_{\lambda_0}\}$ cover the complement of $\mathbb{R}$ in $\mathbb{C}$, and their
inverse images under $f$ have $\mu$ measure $0$. Therefore the inverse image under $f$
of the union has $\mu$ measure $0$, and $\mu(f^{-1}(\mathbb{R}^c)) = 0$. That is, $f$ is real-valued a.e.
This proves (a), and the arguments for (b) and (c) are completely analogous.
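On a finite set carrying point masses, the essential image and the three criteria of Corollary 4.56 can be computed by hand. The following is a minimal sketch under that assumption: for such a space, $\lambda_0$ lies in the essential image exactly when $f$ takes the value $\lambda_0$ at a point of positive mass. The names `essential_image`, `mu`, and `f` are hypothetical.

```python
def essential_image(f, mu):
    """Essential image of f on a finite set with point masses mu[s] >= 0.

    Values taken only at points of mass zero are discarded.
    """
    return {f[s] for s in mu if mu[s] > 0}


mu = {"s1": 0.5, "s2": 0.25, "s3": 0.0, "s4": 0.25}
f = {"s1": 1 + 0j, "s2": 2 + 0j, "s3": 5j, "s4": 2 + 0j}

# By Lemma 4.55 this set is the spectrum of the multiplication operator M_f.
spectrum_of_M_f = essential_image(f, mu)

# The three criteria of Corollary 4.56, read off from the spectrum.
is_self_adjoint = all(z.imag == 0 for z in spectrum_of_M_f)
is_positive_semidefinite = is_self_adjoint and all(z.real >= 0 for z in spectrum_of_M_f)
is_unitary = all(abs(z) == 1 for z in spectrum_of_M_f)

print(sorted(z.real for z in spectrum_of_M_f),
      is_self_adjoint, is_positive_semidefinite, is_unitary)
# [1.0, 2.0] True True False
```

The nonreal value $5i$ is taken only on a null set, so $f$ is real-valued a.e. and $M_f$ is self adjoint even though $f$ is not real-valued everywhere.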


The power of the Spectral Theorem comes through the functional calculus that
it implies for working with operators. We shall prove the relevant theorem and
then give five illustrations of how it is used.


Theorem 4.57 (functional calculus). Fix a bounded normal operator $A$ on a
complex separable Hilbert space $H$. Then there exists one and only one way to
define a system of operators $\varphi(A)$ for every bounded Borel function $\varphi$ on $\sigma(A)$
such that

(a) $z(A) = A$ for the function $\varphi(z) = z$, and $1(A) = I$ for the constant
function $1$,
(b) $\varphi \mapsto \varphi(A)$ is an algebra homomorphism into $B(H,H)$,
(c) $\varphi(A)^* = \bar{\varphi}(A)$,
(d) $\lim_n \varphi_n(A)x = \varphi(A)x$ for all $x \in H$ whenever $\varphi_n \to \varphi$ pointwise with
$\{\varphi_n\}$ uniformly bounded.

The operators $\varphi(A)$ have the additional properties that

(e) $\varphi(A)$ is normal, and all the operators $\varphi(A)$ commute,
(f) $\|\varphi(A)\| \le \|\varphi\|_{\sup}$,
(g) $\lim_n \varphi_n(A) = \varphi(A)$ in the operator-norm topology whenever $\varphi_n \to \varphi$
uniformly,
(h) $\sigma(\varphi(A)) \subseteq (\varphi(\sigma(A)))^{\mathrm{cl}}$,
(i) (spectral mapping property) $\sigma(\varphi(A)) = \varphi(\sigma(A))$ if $\varphi$ is continuous.

PROOF OF EXISTENCE. Apply Theorem 4.54 to the singleton set $\{A\}$, obtaining
a finite measure space $(S, \mu)$, a unitary operator $U : H \to L^2(S, \mu)$, and an $L^\infty$
function $f_A$ on $S$ such that $UAU^{-1} = M_{f_A}$. Examining the proofs of Theorems
4.53 and 4.54, we see that we can take $S$ to be a certain compact Hausdorff space
$\mathcal{A}^*_m$, $\mu$ to be a regular Borel measure on $S$, and the function $f_A$ to be the Gelfand
transform $\hat{A}$, therefore continuous. In the construction of Theorem 4.53, the
measure $\mu$ has the property that $\int_S |\hat{B}|^2\,d\mu = \|Bz\|_H^2$ for every $B$ in $\mathcal{A}$, where
$z$ is a cyclic vector. Therefore $\hat{B} \ne 0$ implies $\int_S |\hat{B}|^2\,d\mu > 0$. Since $|\hat{B}|^2$ is the
most general continuous function $\ge 0$ on $S$, $\mu$ assigns positive measure to every
nonempty open set.

For any bounded Borel function $\varphi$ on $\sigma(A)$, the function $\varphi \circ f_A$ is a
well-defined function on $S$ since Proposition 4.43a shows that the image of $\hat{A} = f_A$
is $\sigma(A)$. The function $\varphi \circ f_A$ is a bounded Borel function since $\varphi^{-1}$ of an open
set in $\mathbb{C}$ is a Borel set of $\mathbb{C}$ and since $f_A^{-1}$ of a Borel set of $\mathbb{C}$ is a Borel set of $S$.
Thus it makes sense to define

$$\varphi(A) = U^{-1}M_{\varphi \circ f_A}U.$$

Then we see that properties (a) through (i) are satisfied for any given normal
$A$ on $H$ if they are valid in the special case of any $M_f$ on $L^2(S, \mu)$ with $f$
continuous, $S$ compact Hausdorff, $\mu$ a regular Borel measure assigning positive
measure to every nonempty open set, and $\varphi(M_f)$ defined for arbitrary bounded
Borel functions $\varphi$ on the image of $f$ by

$$\varphi(M_f) = M_{\varphi \circ f}.$$

Properties (a) through (c) for multiplication operators are immediate, (d) follows
by dominated convergence, (e) and (f) are immediate, and (g) follows directly
from (f). We are left with properties (h) and (i).

Lemma 4.55 identifies the spectrum of a multiplication operator by an $L^\infty$
function with the essential image of the function. Using this identification, we
see that (h) and (i) follow in our special case if it is proved for $f$ continuous that

$$\text{essential image of } \varphi \circ f \subseteq (\varphi(\text{essential image of } f))^{\mathrm{cl}}, \quad \varphi \text{ bounded Borel}, \tag{$*$}$$

$$\text{essential image of } \varphi \circ f = \varphi(\text{essential image of } f), \quad \varphi \text{ continuous}. \tag{$**$}$$

Let us see that these follow if we prove that

$$\text{essential image of } \psi \subseteq (\text{image } \psi)^{\mathrm{cl}} \quad \text{for } \psi : S \to \mathbb{C} \text{ bounded Borel}, \tag{$\dagger$}$$

$$\text{essential image of } \psi = \text{image } \psi \quad \text{for } \psi : S \to \mathbb{C} \text{ continuous}. \tag{$\dagger\dagger$}$$

In fact, if (†) and (††) hold, then for (*) we have

$$\begin{aligned}
\text{essential image}(\varphi \circ f) &\subseteq (\text{image}(\varphi \circ f))^{\mathrm{cl}} && \text{by (†) for } \varphi \circ f \\
&= (\varphi(\text{image } f))^{\mathrm{cl}} \\
&= (\varphi(\text{essential image } f))^{\mathrm{cl}} && \text{by (††) for } f.
\end{aligned}$$

For (**) we have

$$\begin{aligned}
\text{essential image}(\varphi \circ f) &= \text{image}(\varphi \circ f) && \text{by (††) for } \varphi \circ f \\
&= \varphi(\text{image } f) \\
&= \varphi(\text{essential image } f) && \text{by (††) for } f.
\end{aligned}$$

Thus it is enough to prove (†) and (††). For (†) let $\lambda_0$ be in the essential
image of $\psi$. Then for each $n \ge 1$, $\mu\big(\psi^{-1}\{\lambda \mid |\lambda - \lambda_0| < \tfrac1n\}\big) > 0$, and hence
$\psi^{-1}\{\lambda \mid |\lambda - \lambda_0| < \tfrac1n\} \ne \varnothing$. Thus there exists $\lambda = \lambda_n$ in the image of
$\psi$ such that $|\lambda_n - \lambda_0| < \tfrac1n$, and $\lambda_0$ is exhibited as a member of $(\text{image } \psi)^{\mathrm{cl}}$.

For (††) we first show that the image of $\psi$ lies in the essential image of $\psi$ if
$\psi$ is continuous. Thus let $\lambda_0$ be in the image of $\psi$. Then $\psi^{-1}\{\lambda \mid |\lambda - \lambda_0| < \epsilon\}$
is nonempty, and it is open since $\psi$ is continuous. Since nonempty open sets of
$S$ have positive $\mu$ measure, we conclude that $\lambda_0$ is in the essential image of $\psi$.
Then

$$\begin{aligned}
\text{image } \psi &\subseteq \text{essential image } \psi && \text{by what we have just proved} \\
&\subseteq (\text{image } \psi)^{\mathrm{cl}} && \text{by (†)} \\
&= \text{image } \psi && \text{since } S \text{ is compact and } \psi \text{ is continuous},
\end{aligned}$$

and (††) follows. This completes the proof of existence and the list of properties
in Theorem 4.57.


PROOF OF UNIQUENESS. Properties (a) through (c) determine $\varphi(A)$ whenever $\varphi$
is a polynomial function of $z$ and $\bar{z}$. By the Stone–Weierstrass Theorem any
continuous $\varphi$ on a compact set such as $\sigma(A)$ is the uniform limit of such polynomials,
and hence (d) implies that $\varphi(A)$ is determined whenever $\varphi$ is continuous.

The indicator function of a compact subset of $\mathbb{C}$ is the decreasing pointwise
limit of a sequence of continuous functions of compact support, and hence (d)
implies that $\varphi(A)$ is determined whenever $\varphi$ is the indicator function of a compact
set. Applying (b) twice, we see that $\varphi(A)$ is determined whenever $\varphi$ is the
indicator function of any finite disjoint union of differences of compact sets.
Such sets form[23] the smallest algebra of sets containing the compact subsets of
$\sigma(A)$. Another application of (d), together with the Monotone Class Lemma,[24]
shows that $\varphi(A)$ is determined whenever $\varphi$ is the indicator function of any Borel
subset of $\sigma(A)$. Any bounded Borel function on $\sigma(A)$ is the uniform limit of
finite linear combinations of indicator functions of Borel sets, and hence one more
application of (b) and (d) shows that $\varphi(A)$ is determined whenever $\varphi$ is a bounded
Borel function on $\sigma(A)$.

[23] By Lemma 11.2 of Basic.


Corollary 4.58. If H is a complex separable Hilbert space, then every positive 
semidefinite operator in B(H, H) has a unique positive semidefinite square root. 


REMARKS. This is an important application of the Spectral Theorem and the 
functional calculus. It is already important when applied to operators of the form 
A*A with A in B(H, H). For example the corollary allows us in the definition 
of trace-class operator before Proposition 2.8 to drop the assumption that the 
operator is compact; it is enough to assume that it is bounded. 


PROOF. If A is positive semidefinite, then $\sigma(A) \subseteq [0, \infty)$ by Corollary 4.56b.
The usual square root function $\sqrt{\ }$ on $[0, \infty)$ is bounded on $\sigma(A)$, and we can form
$\sqrt{A}$ by Theorem 4.57. Then (a) and (b) in Theorem 4.57 imply that $(\sqrt{A})^2 = A$,
and (i) implies that $\sqrt{A}$ is positive semidefinite. This proves existence.

For uniqueness let B be positive semidefinite with $B^2 = A$. Because of
the uniqueness assertion in Theorem 4.57, we have at our disposal the maximal
abelian self-adjoint subalgebra of B(H, H) that is recalled from Theorem 4.53
and used to define operators $\varphi(A)$ in the proof of Theorem 4.57. Let $\mathcal{A}_0$ be the
smallest C* algebra in B(H, H) containing I, A, and B, and extend $\mathcal{A}_0$ to a
maximal abelian self-adjoint subalgebra $\mathcal{A}$ of B(H, H). We use this $\mathcal{A}$ to define
$\sqrt{A}$. On the compact Hausdorff space $\mathcal{A}^*_m$, the Gelfand transforms $\widehat{\sqrt{A}}$ and $\widehat{B}$
are both nonnegative square roots of $\widehat{A}$ and must be equal. Since the Gelfand
transform for $\mathcal{A}$ is one-one, $B = \sqrt{A}$.
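For a concrete check of Corollary 4.58, the following sketch works out two square roots by hand. Both examples are illustrative choices rather than material from the text: the first takes $H = \mathbb{C}^2$, and the second assumes a bounded measurable function $f \ge 0$ on a measure space $(S, \mu)$ and writes $M_f$ for multiplication by $f$ on $L^2(S, \mu)$. The matrix environment assumes the amsmath package.

```latex
% Example 1 (assumed illustration): a positive semidefinite matrix on C^2.
\[
A=\begin{pmatrix}2&1\\ 1&2\end{pmatrix},
\qquad
\sqrt{A}=\frac12\begin{pmatrix}\sqrt3+1&\sqrt3-1\\ \sqrt3-1&\sqrt3+1\end{pmatrix}.
\]
% With a=(\sqrt3+1)/2 and b=(\sqrt3-1)/2 we have a^2+b^2=2 and 2ab=1,
% so (\sqrt A)^2=A. The eigenvalues of \sqrt A are a+b=\sqrt3 and a-b=1,
% both nonnegative, so \sqrt A is positive semidefinite.

% Example 2 (assumed illustration): a multiplication operator with f >= 0 bounded.
\[
(M_{\sqrt f})^2=M_f,
\qquad
(M_{\sqrt f}\,h,h)=\int_S \sqrt{f}\,|h|^2\,d\mu\;\ge\;0
\quad\text{for all } h\in L^2(S,\mu).
\]
% Thus M_{\sqrt f} is a positive semidefinite square root of M_f, and the
% uniqueness in Corollary 4.58 identifies it as \sqrt{M_f}.
```

In both cases existence could be verified directly; the content of the corollary is that no other positive semidefinite square root can occur.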


Corollary 4.59. Let H be a complex separable Hilbert space, and let A and B
be bounded normal operators on H such that A commutes with B and B*. Then
each $\varphi(A)$, for $\varphi$ a bounded Borel function on $\sigma(A)$, commutes with B and B*.


PROOF. As in the proof of the previous corollary, we have at our disposal
the maximal abelian self-adjoint subalgebra $\mathcal{A}$ of B(H, H) that is used to define
operators $\varphi(A)$. We choose one containing I, A, and B. Then $\varphi(A)$ is in $\mathcal{A}$ and
hence commutes with B and B*.


²⁴Lemma 5.43 of Basic.


Corollary 4.60. Let A be a bounded normal operator on a complex
separable Hilbert space, let $\varphi_2 : \sigma(A) \to \mathbb{C}$ be a continuous function,
and let $\varphi_1 : \varphi_2(\sigma(A)) \to \mathbb{C}$ be a bounded Borel function. Then
$\varphi_1(\varphi_2(A)) = (\varphi_1 \circ \varphi_2)(A)$.


REMARK. If $\varphi_2(z) = \bar z$, this corollary recovers property (c) in Theorem 4.57.


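PROOF. The uniqueness in Theorem 4.57 shows that the operators $\varphi(\varphi_2(A))$
form the unique system defined for bounded Borel functions $\varphi : \sigma(\varphi_2(A)) \to \mathbb{C}$
such that $z(\varphi_2(A)) = \varphi_2(A)$, $1(\varphi_2(A)) = 1$, $\varphi \mapsto \varphi(\varphi_2(A))$ is an algebra
homomorphism, $\varphi(\varphi_2(A))^* = \bar\varphi(\varphi_2(A))$, and $\lim \varphi_n(\varphi_2(A))x = \varphi(\varphi_2(A))x$ for all x
whenever $\varphi_n \to \varphi$ pointwise and boundedly on $\sigma(\varphi_2(A))$.

We now consider the system formed from $\psi(A)$, specialize to functions
$\psi = \varphi \circ \varphi_2$, and make use of the properties of $\psi(A)$ as stated in the existence half
of the theorem. The spectral mapping property in Theorem 4.57 shows that
$\sigma(\varphi_2(A)) = \varphi_2(\sigma(A))$. We have $(z \circ \varphi_2)(A) = \varphi_2(A)$ trivially and
$(1 \circ \varphi_2)(A) = 1(A) = 1$ by (a) for the system $\psi(A)$. The map $\varphi \mapsto (\varphi \circ \varphi_2)(A)$
is an algebra homomorphism as a special case of (b) for $\psi(A)$. The formula
$(\varphi \circ \varphi_2)(A)^* = \overline{\varphi \circ \varphi_2}(A) = (\bar\varphi \circ \varphi_2)(A)$ is a special
case of (c) for $\psi(A)$. And the formula $\lim (\varphi_n \circ \varphi_2)(A)x = (\varphi \circ \varphi_2)(A)x$ is a
special case of (d) for $\psi(A)$. Therefore the system $(\varphi \circ \varphi_2)(A)$ has the properties
that uniquely determine the system $\varphi(\varphi_2(A))$, and we must have
$\varphi(\varphi_2(A)) = (\varphi \circ \varphi_2)(A)$ for every bounded Borel function $\varphi$ on $\sigma(\varphi_2(A))$.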


Corollary 4.61. If A is a bounded normal operator on a complex separable
Hilbert space, then there exists a sequence $\{S_n\}$ of bounded linear operators of
the form $S_n = \sum_{i=1}^{N_n} c_{i,n} E_{i,n}$ converging to A in the operator-norm topology and
having the property that each $E_{i,n}$ is an orthogonal projection of the form $\varphi(A)$.


PROOF. Choose a sequence of simple Borel functions $s_n$ on $\sigma(A)$ converging
uniformly to the function $z$, and let $S_n = s_n(A)$. Then apply Theorem 4.57.


Corollary 4.62. If A is a bounded normal operator on a complex separable 
Hilbert space H of dimension > 1, then there exists a nontrivial orthogonal 
projection that commutes with every bounded normal operator that commutes 
with A and A*. Hence there is a nonzero proper closed vector subspace K of H 
such that B(K) C K for every bounded normal operator B commuting with A 
and A*. 


PROOF. This is a special case of Corollary 4.61. 


This completes our list of illustrations of the functional calculus associated 
with the Spectral Theorem. We now prove a result mentioned near the end of 
Section 10, showing how the spectrum of an operator relates to spaces of maximal 
ideals.

Proposition 4.63. Let A be a bounded normal operator on a complex separable
Hilbert space H, and let $\mathcal{A}$ be the smallest C* subalgebra of B(H, H) containing I,
A, and A*. Then the maximal ideal space $\mathcal{A}^*_m$ is canonically homeomorphic to
the spectrum $\sigma(A)$.


PROOF. Let $B \mapsto \widehat{B}$ be the Gelfand transform for $\mathcal{A}$, carrying $\mathcal{A}$ to $C(\mathcal{A}^*_m)$.
Proposition 4.43a shows that the image of $\widehat{A}$ in $\mathbb{C}$ is $\sigma(A)$, and Corollary 4.49
shows that this version of $\sigma(A)$ is the same as the one obtained from B(H, H).
Therefore we obtain a map $C(\sigma(A)) \to C(\mathcal{A}^*_m)$ by the definition $f \mapsto f \circ \widehat{A}$.
This map is an algebra homomorphism respecting conjugation, and it satisfies
$\|f\|_{\sup} = \|f \circ \widehat{A}\|_{\sup}$ since the function $\widehat{A}$ is onto $\sigma(A)$. This equality of norms
implies that the map $f \mapsto f \circ \widehat{A}$ is one-one.

To see that $f \mapsto f \circ \widehat{A}$ is onto $C(\mathcal{A}^*_m)$, we observe that the operators $p(A, A^*)$,
for p a polynomial in $z$ and $\bar z$, are dense in $\mathcal{A}$ since I, A, and A* generate $\mathcal{A}$.
Using that $\widehat{(\,\cdot\,)}$ is a norm-preserving isomorphism of $\mathcal{A}$ onto $C(\mathcal{A}^*_m)$, we see
that the members $p(\widehat{A}, \overline{\widehat{A}})$ of $C(\mathcal{A}^*_m)$ are dense in $C(\mathcal{A}^*_m)$. Since $C(\sigma(A))$ is
complete and $f \mapsto f \circ \widehat{A}$ is norm preserving, the image is closed. Therefore
$f \mapsto f \circ \widehat{A}$ carries $C(\sigma(A))$ onto $C(\mathcal{A}^*_m)$.

Hence we have a canonical isomorphism of commutative C* algebras $C(\sigma(A))$
and $C(\mathcal{A}^*_m)$. The maximal ideal spaces must be canonically homeomorphic. The
maximal ideal space of $C(\sigma(A))$ contains $\sigma(A)$ because of the point evaluations
but can be no larger than $\sigma(A)$ since the Stone Representation Theorem (Theorem
4.15) shows that the necessarily closed image of $\sigma(A)$ is dense in $(C(\sigma(A)))^*_m$.


FURTHER REMARKS. A version of the Spectral Theorem is valid also for 
unbounded self-adjoint operators on a complex separable Hilbert space. Such op- 
erators are of importance since they enable one to use functional analysis directly 
with linear differential operators, which may be expected to be unbounded. The 
operator L in the Sturm—Liouville theory of Chapter I is an example of the kind 
of operator that one wants to handle directly. The subject has to address a large 
number of technical details, particularly concerning domains of operators, and the 
definitions have to be made just right. The prototype of an unbounded self-adjoint 
operator is the multiplication operator $M_f$ on our usual $L^2(S, \mu)$ corresponding
to an unbounded real-valued function f that is finite almost everywhere; the
domain of $M_f$ is the dense vector subspace of members of $L^2$ whose product
with f is in $L^2$. Just as in this example, the domain of an unbounded self-adjoint
operator is forced by the definitions to be a dense but proper vector subspace of
the whole Hilbert space. Once one is finally able to state the Spectral Theorem
for unbounded self-adjoint operators precisely, the result is proved by reducing
it to Theorem 4.54. Specifically if T is an unbounded self-adjoint operator on
H, then one shows that $(T + i)^{-1}$ is a globally defined bounded normal operator.
Application of Theorem 4.54 to $(T + i)^{-1}$ yields an $L^\infty$ function g such that the
unitary operator $U : H \to L^2(S, \mu)$ carries $(T + i)^{-1}$ to $M_g$. One wants T to
be carried to $M_f$, and hence the definition should force $1/(f + i) = g$. In other
words, f is defined by the equation $f = 1/g - i$. One checks that the unitary
operator U from H to $L^2$ indeed carries T to $M_f$. For a discussion of the use of
the Spectral Theorem in connection with partial differential equations, the reader 
can look at Parts 2 and 3 of Dunford—Schwartz’s Linear Operators. 
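
The relation between f and g in the preceding remarks can be checked directly in the prototype case. The following sketch assumes that T is already the multiplication operator $M_f$ on $L^2(S, \mu)$ with f real-valued and finite almost everywhere; it is a verification in that special case only and is not the general proof.

```latex
% Assumption: T = M_f with f real-valued and finite a.e. on (S, \mu).
\[
(M_f+i)^{-1}=M_g,\qquad g=\frac{1}{f+i},\qquad
|g|^2=\frac{1}{f^2+1}\le 1 ,
\]
% so g is in L^\infty and M_g is bounded even when f is unbounded.
% M_g is normal because M_g and (M_g)^* = M_{\bar g} commute.
\[
g\neq 0 \ \text{a.e.},\qquad
\frac{1}{g}-i=(f+i)-i=f ,
\]
% which recovers f, and in particular the real-valuedness of f, from g.
```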


BIBLIOGRAPHICAL REMARKS. The exposition in Sections 3–6 and Sections
8–9 is based on that in Part 1 of Dunford–Schwartz’s Linear Operators. The
exposition in Section 7 is based on that in Treves’s Topological Vector Spaces, 
Distributions and Kernels. 


12. Problems 


1. Let X be a Banach space, and let Y be a closed vector subspace. Take as known
(from Problem 4 in Chapter XII of Basic) that X/Y becomes a normed linear
space under the definition $\|x + Y\| = \inf_{y \in Y} \|x + y\|$ and that the resulting norm
is complete. Prove that the topology on X/Y obtained this way coincides with
the quotient topology on X/Y as the quotient of a topological vector space by a
closed vector subspace.

2. Let $T : X \to Y$ be a linear function between Banach spaces such that T(X) is
finite-dimensional and ker(T) is closed. Prove that T is continuous.


3. Using the result of Problem 1, derive the Interior Mapping Theorem for Banach 
spaces from the special case in which the mapping is one-one. 

4. If X is a finite-dimensional normed linear space, why must the norm topology 
coincide with the weak topology? 

5. Let H be a separable infinite-dimensional Hilbert space. Give an example of a
sequence $\{x_n\}$ in H with $\|x_n\| = 1$ for all n and with $\{x_n\}$ tending to 0 weakly.

6. In a σ-finite measure space $(S, \mu)$, suppose that the sequence $\{f_n\}$ tends weakly
to f in $L^2(S, \mu)$ and that $\lim_n \|f_n\|_2 = \|f\|_2$. Prove that $\{f_n\}$ tends to f in the
norm topology of $L^2(S, \mu)$.

7. Let X be a normed linear space, let $\{x_n\}$ be a sequence in X with $\{\|x_n\|\}$ bounded,
and let $x_0$ be in X. Prove that if $\lim_n x^*(x_n) = x^*(x_0)$ for all $x^*$ in a dense subset
of $X^*$, then $\{x_n\}$ tends to $x_0$ weakly.

8. Fix p with $0 < p < 1$. It was shown in Section 1 that the set of Borel functions
f on [0, 1] with $\int_{[0,1]} |f|^p\,dx < \infty$, with two functions identified when they are
equal almost everywhere, forms a topological vector space $L^p([0, 1])$ under the
metric $d(f, g) = \int_{[0,1]} |f - g|^p\,dx$. Put $D(f) = \int_{[0,1]} |f|^p\,dx$.

(a) Show for each positive integer n that any function f with $D(f) = 1$ can be
written as $f = \frac{1}{n}(f_1 + \cdots + f_n)$ with $D(f_j) = n^{-(1-p)}$.

(b) Deduce from (a) that if f has $D(f) = 1$, then an arbitrarily large multiple of
f can be written as a convex combination of functions $f_j$ with $D(f_j) < 1$.

(c) Deduce from (b) for each $\varepsilon > 0$ that the smallest convex set containing all
f’s with $D(f) < \varepsilon$ is all of $L^p([0, 1])$.

(d) Why must $L^p([0, 1])$ fail to be locally convex?

(e) Prove that $L^p([0, 1])$ has no nonzero continuous linear functionals.


9. Let U be a nonempty open set in $\mathbb{R}^N$, and let $\{K_p\}_{p \ge 0}$ be an exhausting sequence
of compact subsets of U with $K_0 = \varnothing$. Let M be the set of all monotone
increasing sequences of integers $m_p \ge 0$ tending to infinity, and let E be the set
of all monotone decreasing sequences of real numbers $\varepsilon_p > 0$ tending to 0. For
each pair $(m, \varepsilon) = (\{m_p\}, \{\varepsilon_p\})$ with $m \in M$ and $\varepsilon \in E$, define a seminorm
$\|\cdot\|_{m,\varepsilon}$ on $C^\infty_{\mathrm{com}}(U)$ by

    $\|\varphi\|_{m,\varepsilon} = \sup_{p \ge 0} \varepsilon_p^{-1} \Bigl( \sup_{x \notin K_p} \sup_{|\alpha| \le m_p} |(D^\alpha \varphi)(x)| \Bigr).$

Denote the inductive limit topology on $C^\infty_{\mathrm{com}}(U)$ by $\mathcal{T}$ and the topology defined
with the above uncountable family of seminorms by $\mathcal{T}'$.

(a) Verify for $\varphi$ in $C^\infty(U)$ that $\|\varphi\|_{m,\varepsilon} < \infty$ for all pairs $(m, \varepsilon)$ if and only if $\varphi$
is in $C^\infty_{\mathrm{com}}(U)$.

(b) Prove that the identity mapping $(C^\infty_{\mathrm{com}}(U), \mathcal{T}) \to (C^\infty_{\mathrm{com}}(U), \mathcal{T}')$ is
continuous.

(c) For $p \ge 0$, fix $\psi_p \ge 0$ in $C^\infty_{\mathrm{com}}(U)$ with $\sum_{p \ge 0} \psi_p = 1$, $\psi_0 \ne 0$ on $K_1$, and

    $\psi_p(x) \ne 0$ for x in $K_{p+2} - K_{p+1}^{o}$,
    $\psi_p(x) = 0$ for x in $(K_{p+3}^{o})^c$ and for x in $K_p$.

A basic open neighborhood N of 0 in $(C^\infty_{\mathrm{com}}(U), \mathcal{T})$ is a convex circled set
with 0 as an internal point satisfying conditions of the following form: for
each $p \ge 0$, there exist an integer $n_p$ and a real $\delta_p > 0$ such that a member
$\varphi$ of $C^\infty_{K_{p+3}}$ is in $N \cap C^\infty_{K_{p+3}}$ if and only if
$\sup_{x \in K_{p+3}} \sup_{|\alpha| \le n_p} |D^\alpha \varphi(x)| < \delta_p$.
Prove that there exists a pair $(m, \varepsilon)$ such that $\|\varphi\|_{m,\varepsilon} < 1$ implies that
$2^{p+1} \psi_p \varphi$ is in $N \cap C^\infty_{K_{p+3}}$ for all $p \ge 0$.

(d) With notation as in (c), show that the function $\varphi = \sum_{p \ge 0} 2^{-(p+1)} (2^{p+1} \psi_p \varphi)$
is in N whenever $\|\varphi\|_{m,\varepsilon} < 1$. Conclude that the identity mapping from
$(C^\infty_{\mathrm{com}}(U), \mathcal{T}')$ to $(C^\infty_{\mathrm{com}}(U), \mathcal{T})$ is continuous and that $\mathcal{T}$ and $\mathcal{T}'$ are therefore
the same.

(e) Exhibit a sequence of closed nowhere dense subsets of $C^\infty_{\mathrm{com}}(U)$ with union
$C^\infty_{\mathrm{com}}(U)$, thereby showing that the hypotheses of the Baire Category
Theorem must not be satisfied in $C^\infty_{\mathrm{com}}(U)$.

10. Prove or disprove: If H is an infinite-dimensional separable Hilbert space, then
B(H, H) is separable in the operator-norm topology.

11. Let S be a compact Hausdorff space, let $\mu$ be a regular Borel measure on S,
and regard $\mathcal{A}$ = {multiplications by C(S)} as a subalgebra of $M(L^2(S, \mu))$.
Prove that the commuting algebra $\mathcal{A}'$ of $\mathcal{A}$ within $B(L^2(S, \mu), L^2(S, \mu))$ is
$M(L^2(S, \mu))$.

12. Prove that if A is a bounded normal operator on a separable complex Hilbert
space H, then $\|A\| = \sup_{\|x\| \le 1} |(Ax, x)_H|$.

13. Let H be a separable complex Hilbert space, let $\mathcal{A}$ be a commutative C* sub-
algebra of B(H, H) with identity, and suppose that $\mathcal{A}$ has a cyclic vector.
Prove that there exist a regular Borel measure $\mu$ on $\mathcal{A}^*_m$ and a unitary operator
$U : H \to L^2(\mathcal{A}^*_m, \mu)$ such that

    $U \mathcal{A} U^{-1}$ = {multiplications by $C(\mathcal{A}^*_m)$} $\subseteq M(L^2(\mathcal{A}^*_m, \mu))$.

14. Let A be a bounded normal operator on a separable complex Hilbert space H,
and let $\mathcal{A}$ be the smallest C* subalgebra of B(H, H) containing I, A, and A*.
Suppose that $\mathcal{A}$ has a cyclic vector. Prove that there exists a Borel measure $\mu$ on
the spectrum $\sigma(A)$ and a unitary mapping $U : H \to L^2(\sigma(A), \mu)$ such that

    $U \mathcal{A} U^{-1}$ = {multiplications by $C(\sigma(A))$} $\subseteq M(L^2(\sigma(A), \mu))$

and such that $U A U^{-1}$ is the multiplication operator $M_z$.

15. Form the multiplication operator $M_x$ on $L^2([0, 1])$, and let $\mathcal{A}$ be the smallest C*
subalgebra of $B(L^2([0, 1]), L^2([0, 1]))$ containing I and $M_x$.

(a) Prove that the function 1 is a cyclic vector for $\mathcal{A}$.

(b) Identify the spectrum $\sigma(M_x)$.

(c) Prove in the context of the functional calculus of the Spectral Theorem
that the operator $\varphi(M_x)$ is $M_\varphi$ for every bounded Borel function $\varphi$ on the
spectrum $\sigma(M_x)$.

16. Let A and B be bounded normal operators on a separable complex Hilbert space
H such that A commutes with B and B*. Let $\mathcal{A}$ be the smallest C* subalgebra
of B(H, H) containing I, A, A*, B, and B*.

(a) Prove that $\mathcal{A}^*_m$ is canonically homeomorphic to the subset $\sigma(A, B)$ of
$\sigma(A) \times \sigma(B) \subseteq \mathbb{C}^2$ given by $\sigma(A, B) = \{(\widehat{A}(\ell), \widehat{B}(\ell))\}_{\ell \in \mathcal{A}^*_m}$.

(b) Prove under the identification of (a) that $\widehat{A}$ is identified with the function $z_1$
and $\widehat{B}$ is identified with $z_2$.


Problems 17—20 concern the set of extreme points in particular closed subsets of 
locally convex topological vector spaces. 


17. Let S be a compact Hausdorff space, and let K be the set of all regular Borel
measures on S with $\mu(S) \le 1$. Give K the weak-star topology relative to C(S).
Prove that the extreme points of K are 0 and the point masses of total measure 1.

18. In $L^1([0, 1])$, suppose that f has norm 1 and that E is a Borel subset such that
$\int_E |f|\,dx > 0$ and $\int_{E^c} |f|\,dx > 0$. Let $f_1$ be f on E and be 0 on $E^c$, and let $f_2$
be f on $E^c$ and be 0 on E.

(a) Prove that f is a nontrivial convex combination of $\|f_1\|_1^{-1} f_1$ and $\|f_2\|_1^{-1} f_2$.

(b) Conclude that the closed unit ball of $L^1([0, 1])$ has no extreme points.

19. Let S be a compact Hausdorff space, and let $K_1$ be the set of all regular Borel
measures on S with $\mu(S) = 1$. Give $K_1$ the weak-star topology relative to C(S).
Let F be a homeomorphism of S. Within $K_1$, let K be the subset of members
$\mu$ of $K_1$ that are F invariant in the sense that $\mu(E) = \mu(F^{-1}(E))$ for all Borel
sets E.

(a) Prove that K is a compact convex subset of M(S) in the weak-star topology
relative to C(S).

(b) A member $\mu$ of K is said to be ergodic if every Borel set E such that
F(E) = E has the property that $\mu(E) = 0$ or $\mu(E) = 1$. Prove that every
extreme point of K is ergodic.

(c) Is every ergodic measure in K necessarily an extreme point?

20. Regard the set $\mathbb{Z}$ of integers as a measure space with the counting measure
imposed. As in Section 8, a complex-valued function f(n) on $\mathbb{Z}$ is said to be
positive definite if $\sum_{j,k} c(j) f(j - k) \overline{c(k)} \ge 0$ for all complex-valued functions
c(n) on the integers with finite support.

(a) Prove that every positive definite function f has $f(0) \ge 0$, $f(-n) = \overline{f(n)}$,
and $|f(n)| \le f(0)$.

(b) Prove that a bounded sequence in $L^\infty(\mathbb{Z})$ converges weak-star relative to
$L^1(\mathbb{Z})$ if and only if it converges pointwise.

(c) In view of (a), the set K of positive definite functions f with $f(0) = 1$ is a
subset of the closed unit ball of $L^\infty(\mathbb{Z})$. Prove that the set K is convex and
is compact in the weak-star topology relative to $L^1(\mathbb{Z})$.

(d) Prove that every function $f_\theta(n) = e^{in\theta}$ with $\theta$ real is an extreme point of K.

(e) Take for granted the fact that every positive definite function on $\mathbb{Z}$ is the
sequence of Fourier coefficients of some Borel measure on the circle. (The
corresponding fact for positive definite functions on $\mathbb{R}^N$ is proved in
Problems 8–12 of Chapter VIII of Basic.) Prove that the set K has no other
extreme points besides the ones in (d).


Problems 21-25 elaborate on the Stone Representation Theorem, Theorem 4.15. The 
first of the problems gives a direct proof, without using the Gelfand—Mazur Theorem, 
that every multiplicative linear functional is continuous in the context of Theorem 
4.15. 


21. Let S be a nonempty set, and let A be a uniformly closed subalgebra of B(S)
containing the constants and stable under complex conjugation. Let C be a
complex number with $|C| > 1$, let f be a member of A with $\|f\|_{\sup} \le 1$, and
let $\ell$ be a multiplicative linear functional on A.

(a) Show that $\sum_{n=0}^{\infty} (f/C)^n$ converges and that its sum x provides an inverse to
$1 - (f/C)$ under multiplication.

(b) By applying $\ell$ to the identity $(1 - (f/C))x = 1$, prove that $\ell(f) = C$ is
impossible.

(c) Conclude from (b) that $\|\ell\| \le 1$, hence that $\ell$ is automatically bounded.


22. Let S be a compact Hausdorff space, and let $\ell$ be a multiplicative linear functional
on C(S) such that $\ell(\bar f) = \overline{\ell(f)}$ for all f in C(S). Prove that $\ell$ is the evaluation
$e_s$ at some point s of S.


23. Let S and T be two compact Hausdorff spaces, and let $U : C(S) \to C(T)$ be an
algebra homomorphism that carries 1 to 1 and respects complex conjugation.
(a) Prove that there exists a unique continuous map $u : T \to S$ such that
$(Uf)(t) = f(u(t))$ for every $t \in T$ and $f \in C(S)$.
(b) Prove that if U is one-one, then u is onto.
(c) Prove that if U is an isomorphism, then u is a homeomorphism.


24. Let X be a compact Hausdorff space, and let A and B be uniformly closed
subalgebras of B(X) containing the constants and stable under complex conjugation.
Suppose that $A \subseteq B$. Suppose that S, p, U and T, q, V are data such that S and
T are compact Hausdorff spaces, $p : X \to S$ and $q : X \to T$ are functions with
dense image, and $U : A \to C(S)$ and $V : B \to C(T)$ are algebra isomorphisms
carrying 1 to 1 and respecting complex conjugations such that for every $x \in X$,
$(Uf)(p(x)) = f(x)$ for all $f \in A$ and $(Vg)(q(x)) = g(x)$ for all $g \in B$. Prove that
there exists a unique continuous map $\Phi : T \to S$ such that $p = \Phi \circ q$. Prove
also that this map satisfies $(Uf)(\Phi(t)) = (Vf)(t)$ for all f in A.


25. Formulate and prove a uniqueness statement to complement the existence state- 
ment in Theorem 4.15. 


Problems 26-30 concern inductive limits. As mentioned in a footnote in the text, 
“direct limit” is a construction in category theory that is useful within several different 
settings. These problems concern the setting of topological spaces and continuous 
maps between them. For this setting a direct limit is something attached to a directed 
system of topological spaces and continuous maps. For the latter let $(I, \le)$ be a
directed set, and suppose that $W_i$ is a topological space for each i in I. Suppose that
a one-one continuous map $\psi_{ji} : W_i \to W_j$ is defined whenever $i \le j$, and suppose
that these maps satisfy $\psi_{ii} = 1$ and $\psi_{ki} = \psi_{kj} \circ \psi_{ji}$ whenever $i \le j \le k$. A direct
limit of this directed system consists of a topological space W and continuous maps
$q_i : W_i \to W$ for each i in I satisfying the following universal mapping property:
whenever continuous maps $\varphi_i : W_i \to Z$ are given for each i such that $\varphi_j \circ \psi_{ji} = \varphi_i$
for $i \le j$, then there exists a unique continuous map $\Phi : W \to Z$ such that $\varphi_i = \Phi \circ q_i$
for all i.

26. Suppose that a directed system of topological spaces and continuous maps is
given with notation as above. Let $\bigsqcup_i W_i$ denote the disjoint union of the spaces
$W_i$, topologized so that each $W_i$ appears as an open subset of the disjoint union.
Define an equivalence relation $\sim$ on $\bigsqcup_i W_i$ as follows: if $x_i$ is in $W_i$ and $x_j$ is in
$W_j$, then $x_i \sim x_j$ means that there is some k with $i \le k$ and $j \le k$ such that
$\psi_{ki}(x_i) = \psi_{kj}(x_j)$.

(a) Prove that $\sim$ is an equivalence relation.

(b) Prove that elements $x_i$ in $W_i$ and $x_j$ in $W_j$ have $x_i \sim x_j$ if and only if every
$l$ with $i \le l$ and $j \le l$ has $\psi_{li}(x_i) = \psi_{lj}(x_j)$.

27. Define W to be the quotient $\bigsqcup_i W_i / \sim$, and give W the quotient topology. Let
$q : \bigsqcup_i W_i \to W$ be the quotient map. Prove that W and the system of maps $q|_{W_i}$
form a direct limit of the given directed system.

28. Prove that if $(V, \{p_i\})$ and $(W, \{q_i\})$ are two direct limits of the given system,
then there exists a unique homeomorphism $F : V \to W$ such that $q_i = F \circ p_i$
for all i in I.

29. Suppose that each map $\psi_{ji} : W_i \to W_j$ is a homeomorphism onto an open subset.

(a) Prove that the quotient map $q : \bigsqcup_i W_i \to W$ carries open sets to open sets.

(b) Prove that the direct limit W is Hausdorff if each given $W_i$ is Hausdorff.

(c) Prove that the direct limit W is locally compact Hausdorff if each $W_i$ is
locally compact Hausdorff.

(d) Give an example in which each $W_i$ is compact Hausdorff but the direct limit
W is not compact.

30. Let I be a nonempty index set, and let $S_0$ be a finite subset. Suppose that a locally
compact Hausdorff space $X_i$ is given for each $i \in I$ and that a compact open
subset $K_i$ is specified for each $i \notin S_0$. For each finite subset S of I containing
$S_0$, define

    $X(S) = \bigl( \times_{i \in S} X_i \bigr) \times \bigl( \times_{i \notin S} K_i \bigr),$

giving it the product topology. If $S_1$ and $S_2$ are two finite subsets of I containing
$S_0$ such that $S_1 \subseteq S_2$, then the inclusion $\psi_{S_2 S_1} : X(S_1) \to X(S_2)$ is a
homeomorphism onto an open set, and these homeomorphisms are compatible under
composition. The resulting direct limit X is called the restricted direct product
of the $X_i$’s with respect to the $K_i$’s. Prove that X is locally compact Hausdorff
and that elements of X may be regarded as tuples $(x_i)$ for which $x_i$ is in $X_i$ for
all i while $x_i$ is in $K_i$ for all but finitely many i.


CHAPTER V 


Distributions 


Abstract. This chapter makes a detailed study of distributions, which are continuous linear
functionals on vector spaces of smooth scalar-valued functions. The three spaces of smooth functions
that are studied are the space $C^\infty_{\mathrm{com}}(U)$ of smooth functions with compact support in an open set
U, the space $C^\infty(U)$ of all smooth functions on U, and the space of Schwartz functions $\mathcal{S}(\mathbb{R}^N)$ on
$\mathbb{R}^N$. The corresponding spaces of continuous linear functionals are denoted by $\mathcal{D}'(U)$, $\mathcal{E}'(U)$, and
$\mathcal{S}'(\mathbb{R}^N)$.

Section 1 examines the inclusions among the spaces of smooth functions and obtains the
conclusion that the corresponding restriction mappings on distributions are one-one. It extends from $\mathcal{E}'(U)$
to $\mathcal{D}'(U)$ the definition given earlier for support, it shows that the only distributions of compact
support in U are the ones that act continuously on $C^\infty(U)$, it gives a formula for these in terms of
derivatives and compactly supported complex Borel measures, and it concludes with a discussion of 
operations on smooth functions. 

Sections 2-3 introduce operations on distributions and study properties of these operations. 
Section 2 briefly discusses distributions given by functions, and it goes on to work with multiplications 
by smooth functions, iterated partial derivatives, linear partial differential operators with smooth 
coefficients, and the operation $(\,\cdot\,)^\vee$ corresponding to $x \mapsto -x$. Section 3 discusses convolution at
length. Three techniques are used—the realization of distributions of compact support in terms of 
derivatives of complex measures, an interchange-of-limits result for differentiation in one variable 
and integration in another, and a device for localizing general distributions to distributions of compact 
support. 

Section 4 reviews the operation of the Fourier transform on tempered distributions; this was 
introduced in Chapter III. The two main results are that the Fourier transform of a distribution 
of compact support is a smooth function whose derivatives have at most polynomial growth and 
that the convolution of a distribution of compact support and a tempered distribution is a tempered 
distribution whose Fourier transform is the product of the two Fourier transforms. 

Section 5 establishes a fundamental solution for the Laplacian in $\mathbb{R}^N$ for $N > 2$ and concludes
with an existence theorem for distribution solutions to $\Delta u = f$ when f is any distribution of compact
support. 


1. Continuity on Spaces of Smooth Functions 


Distributions are continuous linear functionals on vector spaces of smooth func- 
tions. Their properties are deceptively simple-looking and enormously helpful. 
Some of their power is hidden in various interchanges of limits that need to be
carried out to establish their basic properties. The result is a theory that is easy to
implement and that yields results quickly. In the last section of this chapter, we 
shall see an example of this phenomenon when we show how it gives information 
about solutions of partial differential equations involving the Laplacian. 

The three vector spaces of scalar-valued smooth functions that we shall consider
in the text¹ of this chapter are $C^\infty(U)$, $\mathcal{S}(\mathbb{R}^N)$, and $C^\infty_{\mathrm{com}}(U)$, where U is a
nonempty open set in $\mathbb{R}^N$. Topologies for these spaces were introduced in Section
IV.2, Section III.1, and Section IV.7, respectively. Let $\{K_p\}$ be an exhausting
sequence of compact subsets of U, i.e., a sequence such that $K_p \subseteq K_{p+1}^{o}$ for all
p and such that $U = \bigcup_{p=1}^{\infty} K_p$.
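
One standard way to produce an exhausting sequence is sketched below. The particular formula is an assumption made for illustration and is not taken from the text; it uses the Euclidean norm $|x|$ and the convention that the distance to the empty set is $+\infty$, which covers the case $U = \mathbb{R}^N$.

```latex
% Assumed construction of an exhausting sequence for an open set U in R^N.
\[
K_p=\bigl\{x\in U : |x|\le p \ \text{and}\ 
\operatorname{dist}(x,\mathbb{R}^N\setminus U)\ge 1/p\bigr\},
\qquad p\ge 1 .
\]
% Each K_p is closed and bounded in R^N, hence compact, and it lies in U.
\[
K_p\subseteq
\bigl\{x\in U : |x|<p+1 \ \text{and}\ 
\operatorname{dist}(x,\mathbb{R}^N\setminus U)>1/(p+1)\bigr\}
\subseteq K_{p+1}^{o}.
\]
% The middle set is open because the distance function is continuous.
% Every x in U has positive distance to the complement of U, so x lies in
% K_p for all large p, and the union of the K_p is U.
```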

The vector space $C^\infty(U)$ of all smooth functions on U is given by a separating
family of seminorms such that a countable subfamily suffices. The members of
the subfamily may be taken to be $\|f\|_{p,\alpha} = \sup_{x \in K_p} |D^\alpha f(x)|$, where $1 \le p < \infty$
and where $\alpha$ varies over all differentiation multi-indices.² The space of continuous
linear functionals is denoted by $\mathcal{E}'(U)$, and the members of this space are called
“distributions of compact support” for reasons that we recall in a moment.

The vector space $\mathcal{S}(\mathbb{R}^N)$ of all Schwartz functions is another space given by
a separating family of seminorms such that a countable subfamily suffices. The
members of the subfamily may be taken to be $\|f\|_{\alpha,\beta} = \sup_{x \in \mathbb{R}^N} |x^\alpha D^\beta f(x)|$,
where $\alpha$ and $\beta$ vary over all differentiation multi-indices.³ The space of continuous
linear functionals is denoted by $\mathcal{S}'(\mathbb{R}^N)$, and the members of this space are
called “tempered distributions.”

The vector space $C^\infty_{\mathrm{com}}(U)$ of all smooth functions of compact support on U
is given by the inductive limit topology obtained from the vector subspaces $C^\infty_{K_p}$.
The space $C^\infty_{K_p}$ consists of the smooth functions with support contained in $K_p$, the
topology on $C^\infty_{K_p}$ being given by the countable family of seminorms
$\|f\|_{p,\alpha} = \sup_{x \in K_p} |D^\alpha f(x)|$. The space of continuous linear functionals is traditionally⁴
written $\mathcal{D}'(U)$, and the members of this space are called simply “distributions.”
Since the field of scalars is a locally convex topological vector space, Proposition
4.29 shows that the members of $\mathcal{D}'(U)$ may be viewed as arbitrary sequences of
consistently defined continuous linear functionals on the spaces $C^\infty_{K_p}$.


¹A fourth space, the space of periodic smooth functions on $\mathbb{R}^N$, is considered in Problems 12–19
at the end of the chapter and again in the problems at the end of Chapter VII.

²The notation for the seminorms in Chapter IV was chosen for the entire separating subfamily
and amounted to $\|f\|_{K_p, D^\alpha}$. The subscripts have been simplified to take into account the nature of
the countable subfamily.

³The notation for the seminorms in Chapter III was chosen for the entire separating subfamily
and amounted to $\|f\|_{x^\alpha, D^\beta}$. The subscripts have been simplified to take into account the nature of
the countable subfamily.

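⁴The tradition dates back to Laurent Schwartz’s work, in which $\mathcal{D}(U)$ was the notation for
$C^\infty_{\mathrm{com}}(U)$ and $\mathcal{D}'(U)$ denoted the space of continuous linear functionals.

For the spaces of smooth functions, there are continuous inclusions

    $C^\infty_{\mathrm{com}}(U) \subseteq C^\infty(U)$    for all U,

    $C^\infty_{\mathrm{com}}(\mathbb{R}^N) \subseteq \mathcal{S}(\mathbb{R}^N) \subseteq C^\infty(\mathbb{R}^N)$    for $U = \mathbb{R}^N$.

We observed in Section IV.2 that $C^\infty_{\mathrm{com}}(U) \subseteq C^\infty(U)$ has dense image.
Proposition 4.12 showed that $C^\infty_{\mathrm{com}}(\mathbb{R}^N) \subseteq \mathcal{S}(\mathbb{R}^N)$ has dense image, and it follows that
$\mathcal{S}(\mathbb{R}^N) \subseteq C^\infty(\mathbb{R}^N)$ has dense image.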

If $i : A \to B$ denotes one of these inclusions and T is a continuous linear
functional on B, then $T \circ i$ is a continuous linear functional on A, and we can
regard $T \circ i$ as the restriction of T to A. Since i has dense image, $T \circ i$ cannot
be 0 unless T is 0. Thus each restriction map $T \mapsto T \circ i$ as above is one-one.
We therefore have one-one restriction maps

    $\mathcal{E}'(U) \to \mathcal{D}'(U)$    for all U,

    $\mathcal{E}'(\mathbb{R}^N) \to \mathcal{S}'(\mathbb{R}^N) \to \mathcal{D}'(\mathbb{R}^N)$    for $U = \mathbb{R}^N$.

This fact justifies using the term “distribution” for any member of $\mathcal{D}'$ and
using the term “distribution” with an appropriate modifier for members of $\mathcal{E}'$ and
$\mathcal{S}'$.

As in Section III.1 it will turn out often to be useful to write the effect of a
distribution T on a function $\varphi$ as $\langle T, \varphi \rangle$, rather than as $T(\varphi)$, and we shall adhere
to this convention systematically for the moment.⁵

We introduced in Section IV.2 the notion of “support” for any member of $\mathcal{E}'(U)$,
and we now extend that discussion to $\mathcal{D}'(U)$. We saw in Proposition 4.10 that if
T is an arbitrary linear functional on $C^\infty_{\mathrm{com}}(U)$ and if $U'$ is the union of all open
subsets $U_\gamma$ of U such that T vanishes on $C^\infty_{\mathrm{com}}(U_\gamma)$, then T vanishes on $C^\infty_{\mathrm{com}}(U')$.
We accordingly define the support of any distribution to be the complement in
U of the union of all open sets $U_\gamma$ such that T vanishes on $C^\infty_{\mathrm{com}}(U_\gamma)$. If T has
empty support, then T = 0 because T vanishes on $C^\infty_{\mathrm{com}}(U)$ and because $C^\infty_{\mathrm{com}}(U)$
is dense in the domain of T. Proposition 4.11 showed that the members of $\mathcal{E}'(U)$
have compact support in this sense; we shall see in Theorem 5.1 that no other
members of $\mathcal{D}'(U)$ have compact support.

An example of a member of $\mathcal{E}'(U)$ was given in Section IV.2: Take finitely
many complex Borel measures $\rho_\alpha$ of compact support within U, the indexing
being by multi-indices $\alpha$ with $|\alpha| \le m$, and put $\langle T, \varphi \rangle = \sum_{|\alpha| \le m} \int_U D^\alpha \varphi(x)\, d\rho_\alpha(x)$.
Then T is in $\mathcal{E}'(U)$, and the support of T is contained in the union of the supports
of the $\rho_\alpha$’s. Theorem 5.1 below gives a converse, but it is necessary in general
to allow the $\rho_\alpha$’s to have support a little larger than the support of the given
distribution T.
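
Two small instances of this construction may help fix ideas. They are illustrative assumptions rather than examples from the text: $x_0$ is a point of U, the $c_\alpha$ are complex constants, $\delta_{x_0}$ is the point mass at $x_0$, and in the second example $N = 1$ with $[a, b] \subseteq U$.

```latex
% Example 1 (assumed): \rho_\alpha = c_\alpha \delta_{x_0}.
\[
\langle T,\varphi\rangle
=\sum_{|\alpha|\le m} c_\alpha\,(D^\alpha\varphi)(x_0),
\qquad \operatorname{support}(T)\subseteq\{x_0\}.
\]

% Example 2 (assumed): N = 1, one measure, Lebesgue measure on [a,b],
% attached to the first derivative.
\[
\langle T,\varphi\rangle=\int_a^b \varphi'(x)\,dx=\varphi(b)-\varphi(a),
\qquad \operatorname{support}(T)=\{a,b\}.
\]
% Here the support of T is strictly smaller than the support [a,b] of the
% measure, so the containment of supports stated above can be proper.
```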

⁵A different convention is to write $\int_U \varphi(x)\, dT(x)$ in place of $\langle T, \varphi \rangle$. This notation emphasizes
an analogy between distributions and measures and is especially useful when more than one $\mathbb{R}^N$
variable is in play. This convention will provide helpful motivation in one spot in Section 3.




Theorem 5.1. If T is a member of $\mathcal{D}'(U)$ with support contained in a compact
subset K of U, then T is in $\mathcal{E}'(U)$. Moreover, if $K'$ is any compact subset of
U whose interior contains K, then there exist a positive integer m and, for each
multi-index $\alpha$ with $|\alpha| \le m$, a complex Borel measure $\rho_\alpha$ supported in $K'$ such
that

    $\langle T, \varphi \rangle = \sum_{|\alpha| \le m} \int_{K'} D^\alpha \varphi \, d\rho_\alpha$    for all $\varphi \in C^\infty(U)$.


REMARK. Problems 8-10 at the end of the chapter discuss the question of 
taking K’ = K under additional hypotheses. 


PROOF. Let $\psi$ be a member of $C^\infty_{\mathrm{com}}(U)$ with values in $[0, 1]$ that is 1 on a neighborhood of $K$ and is 0 on $K'^{\,c}$; such a function exists by Proposition 3.5f. If $\varphi$ is in $C^\infty_{\mathrm{com}}(U)$, then we can write $\varphi = \psi\varphi + (1 - \psi)\varphi$ with $\psi\varphi$ in $C^\infty_{K'}$ and with $(1 - \psi)\varphi$ in $C^\infty_{\mathrm{com}}(K^c)$. The assumption about the support of $T$ makes $\langle T, (1 - \psi)\varphi\rangle = 0$, and therefore

\[ \langle T, \varphi\rangle = \langle T, \psi\varphi\rangle + \langle T, (1 - \psi)\varphi\rangle = \langle T, \psi\varphi\rangle \qquad \text{for all } \varphi \in C^\infty_{\mathrm{com}}(U). \tag{$*$} \]

Since the inclusion $C^\infty_{K'} \to C^\infty_{\mathrm{com}}(U)$ is continuous, we can define a continuous linear functional $T_1$ on $C^\infty_{K'}$ by $T_1(\phi) = \langle T, \phi\rangle$ for $\phi$ in $C^\infty_{K'}$. For any $\varphi$ in $C^\infty_{\mathrm{com}}(U)$, $\phi = \psi\varphi$ is in $C^\infty_{K'}$, and $(*)$ gives $\langle T, \varphi\rangle = \langle T, \psi\varphi\rangle = T_1(\psi\varphi)$. The continuity of $T_1$ on $C^\infty_{K'}$ means that there exist $m$ and $C$ such that

\[ |T_1(\phi)| \le C \sum_{|\alpha|\le m} \sup_{x\in K'} |D^\alpha\phi(x)| \qquad \text{for all } \phi \in C^\infty_{K'}. \tag{$**$} \]

Let $M$ be the number of multi-indices $\alpha$ with $|\alpha| \le m$.

We introduce the Banach space $X$ of $M$-tuples of continuous complex-valued functions on $K'$, the norm for $X$ being the largest of the norms of the components. The Banach-space dual of this space is the space of $M$-tuples of continuous linear functionals on the components, thus the space of $M$-tuples of complex Borel measures on $K'$.

We can embed $C^\infty_{K'}$ as a vector subspace of $X$ by mapping $\phi$ to the $M$-tuple with components $D^\alpha\phi$ for $|\alpha| \le m$. We transfer $T_1$ from $C^\infty_{K'}$ to its image subspace within $X$, and the result, which we still call $T_1$, is a linear functional continuous relative to the norm on $X$ as a consequence of $(**)$. Applying the Hahn-Banach Theorem, we extend $T_1$ to a continuous linear functional $\widetilde{T}_1$ on all of $X$ without an increase in norm. Then $\widetilde{T}_1$ is given on $X$ by an $M$-tuple of complex Borel measures $\rho'_\alpha$ on $K'$, i.e., $\widetilde{T}_1(\{f_\alpha\}_{|\alpha|\le m}) = \sum_{|\alpha|\le m} \int_{K'} f_\alpha\,d\rho'_\alpha$. Therefore any $\varphi$ in $C^\infty_{\mathrm{com}}(U)$ has

\[ \langle T, \varphi\rangle = T_1(\psi\varphi) = \widetilde{T}_1(\{D^\alpha(\psi\varphi)\}_{|\alpha|\le m}) = \sum_{|\alpha|\le m} \int_{K'} D^\alpha(\psi\varphi)\,d\rho'_\alpha. \tag{$\dagger$} \]

The right side of $(\dagger)$ is continuous on $C^\infty(U)$, and therefore $T$ extends to a member of $\mathcal{E}'(U)$. The formula in the theorem follows by expanding out each $D^\alpha(\psi\varphi)$ in $(\dagger)$ by the Leibniz rule for differentiation of products, grouping the derivatives of $\psi$ with the complex measures, and reassembling the expression with new complex measures $\rho_\alpha$.
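
To make the final regrouping step concrete, here is a sketch in the simplest case. The restriction to one variable ($N = 1$) and to $m = 1$ is an assumption made only for illustration; $\rho'_0$ and $\rho'_1$ stand for the two measures produced by the Hahn-Banach step.

```latex
\begin{aligned}
\langle T,\varphi\rangle
 &= \int_{K'} \psi\varphi \, d\rho'_0 + \int_{K'} (\psi\varphi)' \, d\rho'_1 \\
 &= \int_{K'} \psi\varphi \, d\rho'_0 + \int_{K'} (\psi'\varphi + \psi\varphi') \, d\rho'_1 \\
 &= \int_{K'} \varphi \, d\rho_0 + \int_{K'} \varphi' \, d\rho_1,
\qquad
d\rho_0 = \psi\, d\rho'_0 + \psi'\, d\rho'_1, \quad d\rho_1 = \psi\, d\rho'_1 .
\end{aligned}
```

Since $\psi$ and $\psi'$ are bounded continuous functions, $\rho_0$ and $\rho_1$ are again complex Borel measures supported in $K'$.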


In Chapters VII and VIII we shall be interested also in a notion related to support, namely the notion of "singular support." If $f$ is a locally integrable function on the open set $U$, then $f$ defines a member $T_f$ of $\mathcal{D}'(U)$ by

\[ \langle T_f, \varphi\rangle = \int_U f\varphi\,dx \qquad \text{for } \varphi \in C^\infty_{\mathrm{com}}(U). \]

If $U'$ is an open subset of $U$ and $T$ is a distribution on $U$, we say that $T$ equals a locally integrable function on $U'$ if there is some locally integrable function $f$ on $U'$ such that $\langle T, \varphi\rangle = \langle T_f, \varphi\rangle$ for all $\varphi$ in $C^\infty_{\mathrm{com}}(U')$. We say that $T$ equals a smooth function on $U'$ if this condition is satisfied for some $f$ in $C^\infty(U')$. In the latter case the member of $C^\infty(U')$ is certainly unique.

The singular support of a member $T$ of $\mathcal{D}'(U)$ is the complement of the union of all open subsets $U'$ of $U$ such that $T$ equals a smooth function on $U'$. The uniqueness of the smooth function on such a subset implies that if $T$ equals the smooth function $f_1$ on $U'_1$ and equals the smooth function $f_2$ on $U'_2$, then $f_1(x) = f_2(x)$ for $x$ in $U'_1 \cap U'_2$. In fact, $T$ equals the smooth function $f_1\big|_{U'_1\cap U'_2}$ on $U'_1 \cap U'_2$ and also equals the smooth function $f_2\big|_{U'_1\cap U'_2}$ there. The uniqueness forces $f_1\big|_{U'_1\cap U'_2} = f_2\big|_{U'_1\cap U'_2}$. Taking the union of all the open subsets on which $T$ equals a smooth function, we see that $T$ is a smooth function on the complement of its singular support.


EXAMPLE. Take $U = \mathbb{R}^1$, and define

\[ \langle T, \varphi\rangle = \lim_{\varepsilon\downarrow 0} \int_{|x|\ge\varepsilon} \frac{\varphi(x)}{x}\,dx \qquad \text{for } \varphi \in C^\infty_{\mathrm{com}}(\mathbb{R}^1). \]

To see that this is well defined, we choose $\eta$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^1)$ with $\eta$ identically 1 on the support of $\varphi$ and with $\eta(x) = \eta(-x)$ for all $x$. Taylor's Theorem gives $\varphi(x) = \varphi(0) + xR(x)$ with $R$ in $C^\infty(\mathbb{R}^1)$. Multiplying by $\eta(x)/x$ and integrating for $|x| \ge \varepsilon$, we obtain

\[ \int_{|x|\ge\varepsilon} \frac{\varphi(x)}{x}\,dx = \varphi(0) \int_{|x|\ge\varepsilon} \frac{\eta(x)}{x}\,dx + \int_{|x|\ge\varepsilon} R(x)\eta(x)\,dx. \]

The first term on the right side is 0 for every $\varepsilon$ because $\eta(x)/x$ is an odd function, and therefore

\[ \langle T, \varphi\rangle = \int_{\mathbb{R}^1} R(x)\eta(x)\,dx. \]

It follows that $T$ is in $\mathcal{D}'(\mathbb{R}^1)$. On any function compactly supported in $\mathbb{R}^1 - \{0\}$, the original integral defining $T$ is convergent. Thus $T$ equals the function $1/x$ on $\mathbb{R}^1 - \{0\}$. Since $1/x$ is nowhere zero on $\mathbb{R}^1 - \{0\}$, the (ordinary) support of $T$ has to be a closed subset of $\mathbb{R}^1$ containing $\mathbb{R}^1 - \{0\}$. Therefore $T$ has support $\mathbb{R}^1$. On the other hand, $T$ does not equal a function on all of $\mathbb{R}^1$, and $T$ has $\{0\}$ as its singular support.
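
The assertion that $T$ is in $\mathcal{D}'(\mathbb{R}^1)$ can be made quantitative. The following is a sketch under the assumption that $\varphi$ is supported in $[-a, a]$ for some $a > 0$ (the number $a$ is introduced here only for illustration). It uses the integral form $R(x) = \int_0^1 \varphi'(tx)\,dt$ of the Taylor remainder and the fact that the odd function $1/x$ integrates to 0 over $\varepsilon \le |x| \le a$.

```latex
\begin{aligned}
\int_{\varepsilon \le |x| \le a} \frac{\varphi(x)}{x}\,dx
 &= \varphi(0)\int_{\varepsilon \le |x| \le a} \frac{dx}{x}
    + \int_{\varepsilon \le |x| \le a} R(x)\,dx
  = \int_{\varepsilon \le |x| \le a} R(x)\,dx, \\
|\langle T,\varphi\rangle|
 &= \Bigl|\int_{-a}^{a} R(x)\,dx\Bigr|
  \le 2a \sup_{|x|\le a} |R(x)|
  \le 2a \sup_{|x|\le a} |\varphi'(x)| .
\end{aligned}
```

The right side is a constant multiple of one of the seminorms of $C^\infty_K$ with $K = [-a, a]$, which is the kind of estimate that continuity on each $C^\infty_K$ requires.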


Starting in Section 2, we shall examine various operations on distributions. 
Operations on distributions will be defined by duality from corresponding opera- 
tions on smooth functions. For that reason it is helpful to know about continuity 
of various operations on spaces of smooth functions. These we study now. 

We begin with multiplication by smooth functions and with differentiation. If $\psi$ is in $C^\infty(U)$, then multiplication $\varphi \mapsto \psi\varphi$ carries $C^\infty_{\mathrm{com}}(U)$ into itself and also $C^\infty(U)$ into itself. The same is true of any iterated partial derivative operator $\varphi \mapsto D^\alpha\varphi$. We shall show that these operations are continuous. A multiplication $\varphi \mapsto \psi\varphi$ need not carry $\mathcal{S}(\mathbb{R}^N)$ into itself, and we put aside $\mathcal{S}(\mathbb{R}^N)$ for further consideration later.

The kind of continuity result for $C^\infty(U)$ that we are studying tends to follow from an easy computation with seminorms, and it is often true that the same argument can be used to handle also $C^\infty_{\mathrm{com}}(U)$. Here is the general fact.


Lemma 5.2. Suppose that $L : C^\infty(U) \to C^\infty(U)$ is a continuous linear map that carries $C^\infty_{\mathrm{com}}(U)$ into $C^\infty_{\mathrm{com}}(U)$ in such a way that for each compact $K \subseteq U$, $C^\infty_K$ is carried into $C^\infty_{K'}$ for some compact $K' \supseteq K$. Then $L$ is continuous as a linear map from $C^\infty_{\mathrm{com}}(U)$ into $C^\infty_{\mathrm{com}}(U)$.

PROOF. Proposition 4.29b shows that it is enough to prove for each $K$ that the composition of $L : C^\infty_K \to C^\infty_{K'}$ followed by the inclusion of $C^\infty_{K'}$ into $C^\infty_{\mathrm{com}}(U)$ is continuous, and we know that the inclusion is continuous. Fix $K$, choose $K_p$ in the exhausting sequence containing the corresponding $K'$, and let $\alpha$ be a multi-index. By the continuity of $L : C^\infty(U) \to C^\infty(U)$, there exist a constant $C$, some integer $q$ with $q \ge p$, and finitely many multi-indices $\beta_i$ such that $\|L(\varphi)\|_{p,\alpha} \le C \sum_i \|\varphi\|_{q,\beta_i}$. Since $L(\varphi)$ has support in $K' \subseteq K_p$ and $\varphi$ has support in $K \subseteq K' \subseteq K_p \subseteq K_q$, this inequality shows that $\sup_{x\in K'} |D^\alpha(L(\varphi))(x)| \le C \sum_i \sup_{x\in K} |D^{\beta_i}\varphi(x)|$. Hence $L : C^\infty_K \to C^\infty_{K'}$ is continuous, and the lemma follows.


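Proposition 5.3. If $\psi$ is in $C^\infty(U)$, then $\varphi \mapsto \psi\varphi$ is continuous from $C^\infty(U)$ to $C^\infty(U)$ and from $C^\infty_{\mathrm{com}}(U)$ to $C^\infty_{\mathrm{com}}(U)$. If $\alpha$ is any differentiation multi-index, then $\varphi \mapsto D^\alpha\varphi$ is continuous from $C^\infty(U)$ to $C^\infty(U)$ and from $C^\infty_{\mathrm{com}}(U)$ to $C^\infty_{\mathrm{com}}(U)$.

PROOF. The Leibniz rule for differentiation of products gives $D^\alpha(\psi\varphi) = \sum_{\beta\le\alpha} c_\beta (D^{\alpha-\beta}\psi)(D^\beta\varphi)$ for certain integers $c_\beta$. Then

\[ \|\psi\varphi\|_{p,\alpha} \le \sum_{\beta\le\alpha} c_\beta m_\beta \|\varphi\|_{p,\beta}, \]

where $m_\beta = \sup_{x\in K_p} |D^{\alpha-\beta}\psi(x)|$, and it follows that $\varphi \mapsto \psi\varphi$ is continuous from $C^\infty(U)$ into itself. Taking $K' = K$ in Lemma 5.2, we see that $\varphi \mapsto \psi\varphi$ is continuous from $C^\infty_{\mathrm{com}}(U)$ into itself.

Since $\|D^\alpha\varphi\|_{p,\beta} = \|\varphi\|_{p,\alpha+\beta}$, the function $\varphi \mapsto D^\alpha\varphi$ is continuous from $C^\infty(U)$ into itself, and Lemma 5.2 with $K' = K$ shows that $\varphi \mapsto D^\alpha\varphi$ is continuous from $C^\infty_{\mathrm{com}}(U)$ into itself.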

We can combine these two operations into the operation of a linear partial differential operator

\[ P(x, D) = \sum_{|\alpha|\le m} c_\alpha(x) D^\alpha \qquad \text{with all } c_\alpha \text{ in } C^\infty(U) \]

by means of the formula $P(x, D)\varphi = \sum_{|\alpha|\le m} c_\alpha(x) D^\alpha\varphi$. It is to be understood that the operator has smooth coefficients. It is immediate from Proposition 5.3 that $P(x, D)$ is continuous from $C^\infty(U)$ into itself and from $C^\infty_{\mathrm{com}}(U)$ into itself.

An operator $P(x, D)$ as above is said to be of order $m$ if some $c_\alpha(x)$ with $|\alpha| = m$ has $c_\alpha$ not identically 0. The operator reduces to an operator of the form $P(D)$ if the coefficient functions $c_\alpha$ are all constant functions.


We introduce the transpose operator $P(x, D)^{\mathrm{tr}}$ by the formula

\[ P(x, D)^{\mathrm{tr}}\varphi(x) = \sum_{|\alpha|\le m} (-1)^{|\alpha|} D^\alpha\bigl(c_\alpha(x)\varphi(x)\bigr). \]

Expanding out the terms $D^\alpha(c_\alpha(x)\varphi(x))$ by means of the Leibniz rule, we see that $P(x, D)^{\mathrm{tr}}$ is some linear partial differential operator of the form $Q(x, D)$. The next proposition gives the crucial property of the transpose operator.


Proposition 5.4. Suppose that $P(x, D)$ is a linear partial differential operator on $U$. If $u$ and $v$ are in $C^\infty(U)$ and at least one of them is in $C^\infty_{\mathrm{com}}(U)$, then

\[ \int_U \bigl(P(x, D)^{\mathrm{tr}}u(x)\bigr)v(x)\,dx = \int_U u(x)\bigl(P(x, D)v(x)\bigr)\,dx. \]

PROOF. It is enough to prove that the partial derivative operator $D_j$ with respect to $x_j$ satisfies $\int_U (D_j u)v\,dx = -\int_U u(D_j v)\,dx$ since iteration of this formula gives the result of the proposition. Moving everything to one side of the equation and putting $w = uv$, we see that it is enough to prove that $\int_{\mathbb{R}^N} I_U D_j w\,dx = 0$ if $w$ is in $C^\infty_{\mathrm{com}}(U)$, where $I_U$ is the indicator function of $U$. We can drop the $I_U$ from the integration since $D_j w$ is 0 off $U$, and thus it is enough to prove that $\int_{\mathbb{R}^N} D_j w\,dx = 0$ for $w$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$. By Fubini's Theorem the integral may be computed as an iterated integral. The integral on the inside extends over the set where $x_j$ is arbitrary in $\mathbb{R}$ and the other variables take on particular values, say $x_i = c_i$ for $i \ne j$. The integral on the outside extends over all choices of the $c_i$ for $i \ne j$. The inside integral is already 0, because for suitable $a$ and $b$, it is of the form $\int_a^b D_j w\,dx_j = [w]_{x_j=a}^{x_j=b} = 0 - 0 = 0$.
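
As an illustration of Proposition 5.4 in the simplest setting, the following sketch takes $N = 1$ and a second-order operator. The coefficient names $a$, $b$, $c$ are hypothetical smooth functions on $U$, introduced only for this example; the transpose is expanded by the Leibniz rule and the identity is checked by integrating by parts, which is legitimate because one of $u$, $v$ has compact support in $U$.

```latex
\begin{aligned}
P(x,D)v &= a\,v'' + b\,v' + c\,v, \\
P(x,D)^{\mathrm{tr}}u &= (a u)'' - (b u)' + c\,u
  = a\,u'' + (2a' - b)\,u' + (a'' - b' + c)\,u, \\
\int_U \bigl(P(x,D)^{\mathrm{tr}}u\bigr)\,v\,dx
  &= \int_U (au)''\,v\,dx - \int_U (bu)'\,v\,dx + \int_U c\,u\,v\,dx \\
  &= \int_U a u\,v''\,dx + \int_U b u\,v'\,dx + \int_U c\,u\,v\,dx
   = \int_U u\,\bigl(P(x,D)v\bigr)\,dx .
\end{aligned}
```

Note that the transpose has the same leading coefficient $a$ but different lower-order coefficients unless $a$ and $b$ are constant.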


Next let us consider convolution, taking $U = \mathbb{R}^N$. We shall be interested in the function $\psi * \varphi$ given by

\[ \psi * \varphi(x) = \int_{\mathbb{R}^N} \psi(x - y)\varphi(y)\,dy = \int_{\mathbb{R}^N} \psi(y)\varphi(x - y)\,dy, \]

under the assumption that $\psi$ and $\varphi$ are in $C^\infty(\mathbb{R}^N)$ and that one of them has compact support.

A simple device of localization helps with the analysis of this function: If $K$ is the support of $\psi$, then the values of $\psi * \varphi(x)$ for $x$ in a bounded open set $S$ depend only on the value of $\varphi$ on the bounded open set of differences $S - K$. Consequently we can replace $\varphi$ by $\eta\varphi$, where $\eta$ is a member of $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ that is 1 on $S - K$, and the values of $\psi * \varphi(x)$ will match those of $\psi * (\eta\varphi)(x)$ for $x$ in $S$. The latter function is the convolution of two smooth functions of compact support and is smooth by Proposition 3.5c. Therefore $\psi * \varphi$ is always in $C^\infty(\mathbb{R}^N)$ if $\psi$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ and $\varphi$ is in $C^\infty(\mathbb{R}^N)$. We shall use this same device later in treating convolution of distributions.


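Proposition 5.5. If $\psi$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ and $\varphi$ is in $C^\infty(\mathbb{R}^N)$, then
(a) $D^\alpha(\psi * \varphi) = (D^\alpha\psi) * \varphi = \psi * (D^\alpha\varphi)$,
(b) convolution of three functions in $C^\infty(\mathbb{R}^N)$ is associative when at least two of the three functions have compact support,
(c) convolution with $\psi$ is continuous from $C^\infty(\mathbb{R}^N)$ into itself and from $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ into itself,
(d) convolution with $\varphi$ is continuous from $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ into $C^\infty(\mathbb{R}^N)$.
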
PROOF. For (a), let $K$ be the support of $\psi$. Concentrating on $x$'s lying in a bounded open set $S$, choose a function $\eta$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ that is 1 on $S - K$, and then $\psi * \varphi(x) = \psi * (\eta\varphi)(x)$ for $x$ in $S$. Proposition 3.5c says that

\[ D^\alpha(\psi * (\eta\varphi))(x) = (D^\alpha\psi) * (\eta\varphi)(x) = \psi * D^\alpha(\eta\varphi)(x) \]

for all $x$ in $\mathbb{R}^N$, and consequently

\[ D^\alpha(\psi * \varphi)(x) = (D^\alpha\psi) * \varphi(x) = \psi * D^\alpha\varphi(x) \]

for all $x$ in $S$. Since $S$ is arbitrary, (a) follows. The proof of (b) is similar.

For (c), again let $K$ be the support of $\psi$, and apply (a). Then

\[ \|\psi * \varphi\|_{p,\alpha} = \sup_{x\in K_p} |D^\alpha(\psi * \varphi)(x)| = \sup_{x\in K_p} |\psi * (D^\alpha\varphi)(x)| \le \sup_{x\in K_p} \int_K |\psi(y)|\,|D^\alpha\varphi(x - y)|\,dy \le \|\psi\|_1 \sup_{x\in K_p - K} |D^\alpha\varphi(x)|, \]

and the right side is $\le \|\psi\|_1 \|\varphi\|_{q,\alpha}$ if $q$ is large enough so that $K_p - K \subseteq K_q$. This proves the continuity on $C^\infty(\mathbb{R}^N)$, and the continuity on $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ then follows from Lemma 5.2.

For (d), Proposition 4.29b shows that it is enough to prove that $\psi \mapsto \psi * \varphi$ is continuous from $C^\infty_K$ into $C^\infty(\mathbb{R}^N)$ for each compact set $K$. The same estimate as for (c) gives

\[ \|\psi * \varphi\|_{p,\alpha} \le \|\psi\|_1 \|\varphi\|_{q,\alpha} \le |K|\,\|\varphi\|_{q,\alpha} \Bigl(\sup_{x\in K} |\psi(x)|\Bigr) \]

if $q$ is large enough so that $K_p - K \subseteq K_q$. The result follows.


2. Elementary Operations on Distributions 


In this section we take up operations on distributions. If $f$ is a locally integrable function on the open set $U$, we defined the member $T_f$ of $\mathcal{D}'(U)$ by

\[ \langle T_f, \varphi\rangle = \int_U f\varphi\,dx \]

for $\varphi$ in $C^\infty_{\mathrm{com}}(U)$. If $f$ vanishes outside a compact subset of $U$, then $T_f$ is in $\mathcal{E}'(U)$, extending to operate on all of $C^\infty(U)$ by the same formula.

Starting from certain continuous operations $L$ on smooth functions, we want to extend these operations to operations on distributions. So that we can regard $L$ as an extension from smooth functions to distributions, we insist on having $L(T_f) = T_{L(f)}$ if $f$ is smooth. To tie the definition of $L$ on distributions $T_f$ to the definition on general distributions $T$, we insist that $L$ be the "transpose" of some continuous operation $M$ on functions, i.e., that $\langle L(T), \varphi\rangle = \langle T, M(\varphi)\rangle$. Taking $T = T_f$ in this equation, we see that we must have $\int_U L(f)\varphi\,dx = \int_U fM(\varphi)\,dx$. On the other hand, once we have found a continuous $M$ on smooth functions with $\int_U L(f)\varphi\,dx = \int_U fM(\varphi)\,dx$, then we can make the definition $\langle L(T), \varphi\rangle = \langle T, M(\varphi)\rangle$ for the effect of $L$ on distributions. In particular the operator $M$ on smooth functions is unique if it exists. We write $L^{\mathrm{tr}} = M$ for it. In summary, our procedure$^6$ is to find, if we can, a continuous operator $L^{\mathrm{tr}}$ on smooth functions such that

\[ \int_U L(f)\varphi\,dx = \int_U f L^{\mathrm{tr}}(\varphi)\,dx \]

and then to define

\[ \langle L(T), \varphi\rangle = \langle T, L^{\mathrm{tr}}(\varphi)\rangle. \]


We begin with the operation of multiplication, whose continuity is addressed in Proposition 5.3. If $L$ is multiplication by the function $\psi$ in $C^\infty(U)$, then we can take $L^{\mathrm{tr}} = L$ because $\int_U L(f)\varphi\,dx = \int_U (\psi f)\varphi\,dx = \int_U f(\psi\varphi)\,dx = \int_U f L^{\mathrm{tr}}(\varphi)\,dx$ if $f$ and $\varphi$ are in $C^\infty(U)$ and one of them has compact support. Thus our definition of multiplication of a distribution $T$ by $\psi$ in $C^\infty(U)$ is

\[ \langle \psi T, \varphi\rangle = \langle T, \psi\varphi\rangle. \]

Here we assume either that $T$ is in $\mathcal{D}'(U)$ and $\varphi$ is in $C^\infty_{\mathrm{com}}(U)$ or else that $T$ is in $\mathcal{E}'(U)$ and $\varphi$ is in $C^\infty(U)$. Briefly we say that at least one of $T$ and $\varphi$ has compact support.

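The operation of multiplication by a function can be used to localize the effect of a distribution in a way that is useful in the definition below of convolution of distributions. First observe that if $T$ is in $\mathcal{D}'(U)$ and $\eta$ is in $C^\infty_{\mathrm{com}}(U)$, then the support of $\eta T$ is contained in the support of $\eta$; in fact, if $\varphi$ is any member of $C^\infty_{\mathrm{com}}(U \cap \mathrm{support}(\eta)^c)$, then $\eta\varphi = 0$ and hence $\langle \eta T, \varphi\rangle = \langle T, \eta\varphi\rangle = 0$. In particular, $\eta T$ is in $\mathcal{E}'(U)$. On the other hand, we lose no information about $T$ by this operation if we allow all possible $\eta$'s, because if $T$ is in $\mathcal{D}'(U)$ and if $\varphi$ is a member of $C^\infty_{\mathrm{com}}(U)$ with support in a compact subset $K$ of $U$, then any $\eta$ in $C^\infty_{\mathrm{com}}(U)$ that is identically 1 on $K$ has $\varphi = \eta\varphi$ and hence $\langle T, \varphi\rangle = \langle T, \eta\varphi\rangle = \langle \eta T, \varphi\rangle$.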

Next we consider differentiation, which is a continuous operation by Proposition 5.3. When $L$ gives the iterated derivative $D^\alpha$ of a distribution, we can take the operation $L^{\mathrm{tr}}$ on smooth functions to be $(-1)^{|\alpha|}$ times $D^\alpha$. The definition is then

\[ \langle D^\alpha T, \varphi\rangle = (-1)^{|\alpha|}\langle T, D^\alpha\varphi\rangle. \]

Again we assume that at least one of $T$ and $\varphi$ has compact support.
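
A standard worked example may help fix the sign convention. The sketch below takes $U = \mathbb{R}^1$; the symbols $H$ (the indicator function of $[0, \infty)$, which is locally integrable) and $\delta$ (evaluation at 0, $\langle \delta, \varphi\rangle = \varphi(0)$) are introduced here for illustration and are not notation from this section.

```latex
\langle D\,T_H, \varphi\rangle
 = (-1)^{1}\langle T_H, D\varphi\rangle
 = -\int_0^\infty \varphi'(x)\,dx
 = \varphi(0)
 = \langle \delta, \varphi\rangle
 \qquad\text{for } \varphi \in C^\infty_{\mathrm{com}}(\mathbb{R}^1).
```

Thus $D\,T_H = \delta$, even though $H$ is not differentiable at 0 in the classical sense.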

Putting these definitions together yields the definition of the operation of a linear partial differential operator $P(x, D)$ with smooth coefficients on distributions. The formula is

\[ \langle P(x, D)T, \varphi\rangle = \langle T, P(x, D)^{\mathrm{tr}}\varphi\rangle, \]

where $P(x, D)^{\mathrm{tr}}$ is the transpose differential operator defined in Section 1. This definition is forced to satisfy $P(x, D)T_f = T_{P(x,D)f}$ on smooth $f$.

$^6$ Another way of proceeding is to use topologies on $\mathcal{E}'(U)$ and $\mathcal{D}'(U)$ such that $C^\infty_{\mathrm{com}}(U)$ is dense in $\mathcal{E}'(U)$ and $C^\infty(U)$ is dense in $\mathcal{D}'(U)$. The approach in the text avoids the use of such topologies on spaces of distributions, and it will not be necessary to consider them.

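For further operations let us specialize to the setting that $U = \mathbb{R}^N$. The first is the operation of acting by $-1$ in the domain. For a function $\varphi$, we define $\varphi^\vee(x) = \varphi(-x)$. It is easy to check that this operation is continuous on $C^\infty(\mathbb{R}^N)$ and on $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$. Since $\int_{\mathbb{R}^N} f^\vee\varphi\,dx = \int_{\mathbb{R}^N} f\varphi^\vee\,dx$ by a change of variables, the operator $L^{\mathrm{tr}}$ corresponding to $L(f) = f^\vee$ is just $L$ itself. Thus the corresponding operation $T \mapsto T^\vee$ on distributions is given by

\[ \langle T^\vee, \varphi\rangle = \langle T, \varphi^\vee\rangle. \]

The operation $(\,\cdot\,)^\vee$ has the further property that $(\varphi^\vee)^\vee = \varphi$ and $(T^\vee)^\vee = T$.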
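
For completeness, the change of variables behind the identity $\int f^\vee\varphi\,dx = \int f\varphi^\vee\,dx$ can be written out as follows; the substitution variable $u$ is introduced only for this sketch.

```latex
\int_{\mathbb{R}^N} f^\vee(x)\,\varphi(x)\,dx
 = \int_{\mathbb{R}^N} f(-x)\,\varphi(x)\,dx
 = \int_{\mathbb{R}^N} f(u)\,\varphi(-u)\,du
 = \int_{\mathbb{R}^N} f(u)\,\varphi^\vee(u)\,du ,
 \qquad u = -x,\ \ |\det(-I)| = 1 .
```

The Jacobian factor is 1 in every dimension $N$, so no sign appears, in contrast to the factor $(-1)^{|\alpha|}$ that arises for differentiation.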


3. Convolution of Distributions 


The next operation, again in the setting of $\mathbb{R}^N$, is the convolution of two distributions. Convolution is considerably more complicated than the operations considered so far because it involves two variables.

The method of Section 2 starts off easily enough. An easy change of variables shows that any three smooth functions, two of which have compact support, satisfy $\int_{\mathbb{R}^N} (\psi * f)\varphi\,dx = \int_{\mathbb{R}^N} \psi(f^\vee * \varphi)\,dx$, where $f^\vee(x) = f(-x)$. This means that $\int_{\mathbb{R}^N} L(\psi)\varphi\,dx = \int_{\mathbb{R}^N} \psi L^{\mathrm{tr}}(\varphi)\,dx$, where $L(\psi) = \psi * f$ and $L^{\mathrm{tr}}(\varphi) = f^\vee * \varphi$. Thus Section 2 says to define $T * f$ by $\langle T * f, \varphi\rangle = \langle T, f^\vee * \varphi\rangle$. To handle the other convolution variable, however, we have to know that $T * f$ is a smooth function and that the passage from $f$ to $T * f$ is continuous, and neither of these facts is immediately apparent. In addition, there are several cases to handle, depending on which two of the functions $f$, $\psi$, and $\varphi$ at the start have compact support.
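
The change of variables in question can be sketched as follows, with all integrals taken over $\mathbb{R}^N$. The interchange of the order of integration is an application of Fubini's Theorem; it is justified because, when two of the three functions have compact support, the integrand $\psi(y)f(x - y)\varphi(x)$ is continuous and compactly supported as a function of $(x, y)$.

```latex
\begin{aligned}
\int (\psi * f)(x)\,\varphi(x)\,dx
 &= \int\!\!\int \psi(y)\, f(x-y)\,\varphi(x)\,dy\,dx \\
 &= \int \psi(y) \Bigl[\int f^\vee(y-x)\,\varphi(x)\,dx\Bigr] dy
  = \int \psi(y)\,(f^\vee * \varphi)(y)\,dy .
\end{aligned}
```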

Sorting out all these matters could be fairly tedious, but there is a model for 
what happens that will help us anticipate the results. We shall follow the path 
that the model suggests. Then afterward, if we were to want to do so, it would 
be possible to go back and see that all the arguments with transposes in the style 
of Section 2 can be carried through with the tools that we have had to establish 
anyway. 

The model takes a cue from Theorem 5.1, which says that members of $\mathcal{E}'(\mathbb{R}^N)$ are given by integration with compactly supported complex Borel measures and derivatives of them. In particular our definitions ought to specialize to familiar constructions when they are given by compactly supported positive Borel measures. In the case of measures, convolution is discussed in Problem 5 of Chapter VIII of Basic. The definition and results are as follows:

(i) $(\mu_1 * \mu_2)(E) = \int_{\mathbb{R}^N} \mu_1(E - x)\,d\mu_2(x)$ by definition,
(ii) $\int_{\mathbb{R}^N} \varphi\,d(\mu_1 * \mu_2) = \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} \varphi(x + y)\,d\mu_1(x)\,d\mu_2(y)$ for $\varphi \in C_{\mathrm{com}}(\mathbb{R}^N)$,
(iii) $\mu_1 * \mu_2 = \mu_2 * \mu_1$,
(iv) $\varphi\,dx * \mu$ is the continuous function $(\varphi\,dx * \mu)(x) = \int_{\mathbb{R}^N} \varphi(x - y)\,d\mu(y) = \int_{\mathbb{R}^N} (\varphi^\vee)_{-x}\,d\mu$ for $\varphi \in C_{\mathrm{com}}(\mathbb{R}^N)$, where the subscript $-x$ refers to the translate $h_t(y) = h(y + t)$.
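
Properties (ii) and (iv) can be checked by hand on point masses. In this sketch $\delta_a$ denotes the unit point mass at $a \in \mathbb{R}^N$; the notation is introduced here for illustration and is not taken from the text.

```latex
\begin{aligned}
\int \varphi \, d(\delta_a * \delta_b)
 &= \int\!\!\int \varphi(x+y)\,d\delta_a(x)\,d\delta_b(y) = \varphi(a+b),
 \quad\text{so}\quad \delta_a * \delta_b = \delta_{a+b}, \\
(\varphi\,dx * \delta_a)(x)
 &= \int \varphi(x-y)\,d\delta_a(y) = \varphi(x-a).
\end{aligned}
```

The first line agrees with the definition (i), since $\delta_a(E - b)$ equals 1 exactly when $a + b$ lies in $E$.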


The measures and the function $\varphi$ in these properties are all assumed compactly supported, but some relaxation of this condition is permissible. For example the function $\varphi$ can be allowed to be any continuous scalar-valued function on $\mathbb{R}^N$.

In defining convolution of distributions and establishing its properties, we shall face three kinds of technical problems: One is akin to Fubini's Theorem and will be handled for $\mathcal{E}'(\mathbb{R}^N)$ by appealing to Theorem 5.1 and using the ordinary form of Fubini's Theorem with measures. A second is a regularity question (showing that certain integrations in one variable of functions of two variables lead to smooth functions of the remaining variable) and will be handled for $\mathcal{E}'(\mathbb{R}^N)$ by Lemma 5.6 below. A third is the need to work with $\mathcal{D}'(\mathbb{R}^N)$, not just $\mathcal{E}'(\mathbb{R}^N)$, and will be handled by the localization device $T \mapsto \eta T$ mentioned in Section 2. We begin with the lemma that addresses the regularity question.


Lemma 5.6. Let $K$ be a compact metric space, and let $\mu$ be a Borel measure on $K$. Suppose that $\Phi = \Phi(x, y)$ is a scalar-valued function on $\mathbb{R}^N \times K$ such that $\Phi(\,\cdot\,, y)$ is smooth for each $y$ in $K$, and suppose further that every iterated partial derivative $D^\alpha_x\Phi$ in the first variable is continuous on $\mathbb{R}^N \times K$. Then the function

\[ F(x) = \int_K \Phi(x, y)\,d\mu(y) \]

is smooth on $\mathbb{R}^N$ and satisfies $D^\alpha F(x) = \int_K D^\alpha_x\Phi(x, y)\,d\mu(y)$ for every multi-index $\alpha$.

REMARKS. The lemma gives us a new proof of the smoothness shown in Section 1 for $\psi * \varphi$ when $\psi$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ and $\varphi$ is in $C^\infty(\mathbb{R}^N)$. In fact, we write the convolution as $\psi * \varphi(x) = \int_{\mathbb{R}^N} \varphi(x - y)\psi(y)\,dy$ and apply the lemma with $\mu$ equal to Lebesgue measure on the compact set $\mathrm{support}(\psi)$ and with $F(x) = \psi * \varphi(x)$ and $\Phi(x, y) = \varphi(x - y)\psi(y)$.


PROOF. In the proof we may assume without loss of generality that $\Phi$ is real-valued. We begin by showing that $F$ is continuous. If $x_n \to x_0$, then the uniform continuity of $\Phi$ on the compact set $\{x_n\}_{n\ge 0} \times K$ implies that $\lim_n \Phi(x_n, y) = \Phi(x_0, y)$ uniformly. Dominated convergence allows us to conclude that $\lim_n \int_K \Phi(x_n, y)\,d\mu(y) = \int_K \Phi(x_0, y)\,d\mu(y)$. Therefore $F$ is continuous.

Let $B$ be a (large) closed ball in $\mathbb{R}^N$, and suppose that $x$ is a member of $B$ that is at distance at least 1 from $B^c$. If $e_j$ denotes the $j$th standard basis vector of $\mathbb{R}^N$ and if $|h| < 1$, then the Mean Value Theorem gives

\[ \frac{\Phi(x + he_j, y) - \Phi(x, y)}{h} = \frac{\partial\Phi}{\partial x_j}(c, y) \]

for some $c$ on the line segment between $x$ and $x + he_j$. If $\varepsilon > 0$ is given, choose the $\delta$ of uniform continuity of $\partial\Phi/\partial x_j$ on the compact set $B \times K$. We may assume that $\delta < 1$. For $|h| < \delta$ and for $y$ in $K$, we have

\[ \Bigl|\frac{\Phi(x + he_j, y) - \Phi(x, y)}{h} - \frac{\partial\Phi}{\partial x_j}(x, y)\Bigr| = \Bigl|\frac{\partial\Phi}{\partial x_j}(c, y) - \frac{\partial\Phi}{\partial x_j}(x, y)\Bigr| < \varepsilon, \]

the inequality holding since $(c, y)$ and $(x, y)$ are both in $B \times K$ and are at distance at most $\delta$ from one another. As a consequence, if $L$ is any compact subset of $\mathbb{R}^N$, then

\[ \lim_{h\to 0} \frac{\Phi(x + he_j, y) - \Phi(x, y)}{h} = \frac{\partial\Phi}{\partial x_j}(x, y) \]

uniformly for $(x, y)$ in $L \times K$. Because of this uniform convergence we have

\[ \lim_{h\to 0} \int_K \frac{\Phi(x + he_j, y) - \Phi(x, y)}{h}\,d\mu(y) = \int_K \frac{\partial\Phi}{\partial x_j}(x, y)\,d\mu(y). \]

The integral on the left side equals $h^{-1}[F(x + he_j) - F(x)]$, and the limit relation therefore shows that $\frac{\partial}{\partial x_j}\int_K \Phi(x, y)\,d\mu(y)$ exists and equals $\int_K \frac{\partial\Phi}{\partial x_j}(x, y)\,d\mu(y)$.

This establishes the formula $D^\alpha F(x) = \int_K D^\alpha_x\Phi(x, y)\,d\mu(y)$ for $\alpha$ equal to the multi-index that is 1 in the $j$th place and 0 elsewhere. The remainder of the proof makes the above argument into an induction. If we have established the formula $D^\alpha F(x) = \int_K D^\alpha_x\Phi(x, y)\,d\mu(y)$ for a certain $\alpha$, then the first paragraph of the proof shows that $D^\alpha F$ is continuous. The second paragraph of the proof shows for each partial derivative operator $D_j$ in one of the $x$ variables that the operator $D^\beta = D_j D^\alpha$ has $D^\beta F(x) = \int_K D^\beta_x\Phi(x, y)\,d\mu(y)$. The lemma follows.


For our definitions let us begin with the convolution of two members of $\mathcal{E}'(\mathbb{R}^N)$. As indicated at the start of the section, we shall jump right to the final formula. The justification via formulas for transpose operations can be done afterward if desired. If we use notation that treats distributions like measures, the formula (ii) above suggests trying

\[ \langle S * T, \varphi\rangle = \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} \varphi(x + y)\,dT(y)\,dS(x) = \langle S, \langle T, \varphi_x\rangle\rangle = \langle T, \langle S, \varphi_y\rangle\rangle, \]

where the subscript again indicates a translation: $\varphi_x(z) = \varphi(z + x)$. The outside distribution acts on the subscripted variable, and the inside distribution acts on the hidden variable. To make this into a rigorous definition, however, we have to check that $\langle T, \varphi_x\rangle$ and $\langle S, \varphi_y\rangle$ are smooth, that the last equality in the above display is valid, and that the resulting dependence on $\varphi$ is continuous. We carry out these steps in the next proposition.

Proposition 5.7. Let $S$ and $T$ be in $\mathcal{E}'(\mathbb{R}^N)$, and let $\varphi$ be in $C^\infty(\mathbb{R}^N)$. Then
(a) the functions $x \mapsto \langle T, \varphi_x\rangle$ and $y \mapsto \langle S, \varphi_y\rangle$ are smooth on $\mathbb{R}^N$,
(b) $D^\alpha(x \mapsto \langle T, \varphi_x\rangle) = \langle T, (D^\alpha\varphi)_x\rangle$,
(c) the function $\varphi \mapsto \langle T, \varphi_x\rangle$ is continuous from $C^\infty(\mathbb{R}^N)$ into itself and from $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ into itself,
(d) $\langle S, \langle T, \varphi_x\rangle\rangle = \langle T, \langle S, \varphi_y\rangle\rangle$,
(e) the function $\varphi \mapsto \langle S, \langle T, \varphi_x\rangle\rangle$ is continuous from $C^\infty(\mathbb{R}^N)$ into the scalars,
(f) the formula

\[ \langle S * T, \varphi\rangle = \langle S, \langle T, \varphi_x\rangle\rangle = \langle T, \langle S, \varphi_y\rangle\rangle \]

determines a well-defined member of $\mathcal{E}'(\mathbb{R}^N)$ such that $S * T = T * S$,
(g) the supports of $S$, $T$, and $S * T$ are related by

\[ \mathrm{support}(S * T) \subseteq \mathrm{support}(S) + \mathrm{support}(T). \]


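PROOF. Let expressions for $S$ and $T$ in Theorem 5.1 be

\[ \langle S, \varphi\rangle = \sum_\alpha \int_{\mathbb{R}^N} D^\alpha\varphi(x)\,d\rho_\alpha(x) \quad\text{and}\quad \langle T, \varphi\rangle = \sum_\beta \int_{\mathbb{R}^N} D^\beta\varphi(y)\,d\sigma_\beta(y), \]

the sums both being over finite sets of multi-indices and the complex measures being supported on some compact subset of $\mathbb{R}^N$. Then

\[ \langle T, \varphi_x\rangle = \sum_\beta \int_{\mathbb{R}^N} D^\beta\varphi(x + y)\,d\sigma_\beta(y). \tag{$*$} \]

If we apply Lemma 5.6 with $\Phi(x, y) = D^\beta\varphi(x + y)$ and treat $y$ as varying over the union of the compact supports of the $\sigma_\beta$'s, then we see that each term in the sum over $\beta$ is a smooth function of $x$. Hence $x \mapsto \langle T, \varphi_x\rangle$ is smooth, and symmetrically $y \mapsto \langle S, \varphi_y\rangle$ is smooth. This proves (a).

Applying to $(*)$ the conclusions of Lemma 5.6 about passing the derivative operator $D^\alpha$ under the integral sign, we obtain

\[ D^\alpha(x \mapsto \langle T, \varphi_x\rangle) = \sum_\beta \int_{\mathbb{R}^N} D^{\alpha+\beta}\varphi(x + y)\,d\sigma_\beta(y) = \langle T, (D^\alpha\varphi)_x\rangle. \]

This proves (b).

If $K$ denotes a compact subset of $\mathbb{R}^N$ containing the supports of all the $\sigma_\beta$'s, then

\[ |D^\alpha\langle T, \varphi_x\rangle| \le \sum_\beta \sup_{y\in K} |D^{\alpha+\beta}\varphi(x + y)|\,\|\sigma_\beta\|, \]

where $\|\sigma_\beta\|$ denotes the total-variation norm of $\sigma_\beta$. Hence every compact set $L$ has

\[ \sup_{x\in L} |D^\alpha\langle T, \varphi_x\rangle| \le \sum_\beta \sup_{z\in K+L} |D^{\alpha+\beta}\varphi(z)|\,\|\sigma_\beta\|. \]

This proves (c) for $C^\infty(\mathbb{R}^N)$. Combining this same inequality with Lemma 5.2, we obtain (c) for $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$.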


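The formula for $\langle S, \,\cdot\,\rangle$ and the identity $(*)$ together give

\[ \langle S, \langle T, \varphi_x\rangle\rangle = \sum_{\alpha,\beta} \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} D^\alpha_x D^\beta\varphi(x + y)\,d\sigma_\beta(y)\,d\rho_\alpha(x) = \sum_{\alpha,\beta} \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} D^{\alpha+\beta}\varphi(x + y)\,d\sigma_\beta(y)\,d\rho_\alpha(x). \tag{$**$} \]

By Fubini's Theorem the right side is equal to

\[ \sum_{\alpha,\beta} \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} D^{\alpha+\beta}\varphi(x + y)\,d\rho_\alpha(x)\,d\sigma_\beta(y) = \langle T, \langle S, \varphi_y\rangle\rangle. \]

This proves (d).

Conclusion (e) is immediate from (c) and the continuity of $S$ on $C^\infty(\mathbb{R}^N)$. Thus $S * T$ is in $\mathcal{E}'(\mathbb{R}^N)$. The equality in (d) shows that $S * T = T * S$. This proves (f).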

Finally let $L$ be the compact set $\mathrm{support}(S) + \mathrm{support}(T)$, and suppose that $\varphi$ is in $C^\infty_{\mathrm{com}}(L^c)$. Let $d > 0$ be the distance from $\mathrm{support}(\varphi)$ to $L$, and let $D$ be the function giving the distance to a set. Define

\[ L_S = \{x \mid D(x, \mathrm{support}(S)) \le \tfrac13 d\} \quad\text{and}\quad L_T = \{x \mid D(x, \mathrm{support}(T)) \le \tfrac13 d\}. \]

If $x_S$ is in $L_S$ and $x_T$ is in $L_T$, then $|x_S - s| \le \tfrac13 d$ and $|x_T - t| \le \tfrac13 d$ for some $s$ in $\mathrm{support}(S)$ and $t$ in $\mathrm{support}(T)$. Thus $|(x_S + x_T) - (s + t)| \le \tfrac23 d$. Hence $x_S + x_T$ is at distance $\le \tfrac23 d$ from $L$. Since every member of $\mathrm{support}(\varphi)$ is at distance $\ge d$ from $L$, $x_S + x_T$ is not in $\mathrm{support}(\varphi)$. Therefore

\[ (L_S + L_T) \cap \mathrm{support}(\varphi) = \varnothing. \tag{$\dagger$} \]

Also, $\mathrm{support}(S) \subseteq (L_S)^o$ and $\mathrm{support}(T) \subseteq (L_T)^o$. Since $L_S$ contains a neighborhood of $\mathrm{support}(S)$, Theorem 5.1 allows us to express $S$ in terms of complex Borel measures $\rho_\alpha$ supported in $L_S$. Similarly we can express $T$ in terms of complex Borel measures $\sigma_\beta$ supported in $L_T$. By $(\dagger)$ the integrand $D^{\alpha+\beta}\varphi(x + y)$ in the double-integral formula $(**)$ for $\langle S, \langle T, \varphi_x\rangle\rangle$ is identically 0 for $x$ in $L_S$ and $y$ in $L_T$, and hence $\langle S, \langle T, \varphi_x\rangle\rangle = 0$. Thus $\langle S * T, \varphi\rangle = 0$ for all $\varphi$ in $C^\infty_{\mathrm{com}}(L^c)$, and we conclude that $\mathrm{support}(S * T) \subseteq L = \mathrm{support}(S) + \mathrm{support}(T)$. This proves (g).

Proposition 5.7 establishes facts about the convolution of two members of $\mathcal{E}'(\mathbb{R}^N)$ as a member of $\mathcal{E}'(\mathbb{R}^N)$. If one of the two members is in fact a smooth function of compact support, then the corresponding results about convolution of measures suggest that the convolution should be a smooth function. The necessary tools for carrying out a proof are already in place in Proposition 5.7 and Theorem 5.1.


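Corollary 5.8. If $S$ is in $\mathcal{E}'(\mathbb{R}^N)$, $f$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, and $\varphi$ is in $C^\infty(\mathbb{R}^N)$, then

\[ \langle S * T_f, \varphi\rangle = \langle S, f^\vee * \varphi\rangle. \]

Moreover, $S * T_f$ is given by the $C^\infty$ function $y \mapsto \langle S, (f^\vee)_{-y}\rangle$, i.e., $S * T_f = T_F$ with $F(y) = \langle S, (f^\vee)_{-y}\rangle$.

REMARKS. For $S$ in $\mathcal{E}'(\mathbb{R}^N)$ and $f$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, we write $S * f$ for the $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ function $F$ of the corollary such that $S * T_f = T_F$. The specific formula that we shall use to simplify notation is

\[ S * T_f = T_{S*f}, \]

with the right side written as $T_{S*f}$ rather than $T_{S*T_f}$.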


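PROOF. Proposition 5.7f gives

\[ \langle S * T_f, \varphi\rangle = \langle S, \langle T_f, \varphi_x\rangle\rangle = \Bigl\langle S, \int_{\mathbb{R}^N} f(y)\varphi(x + y)\,dy\Bigr\rangle = \Bigl\langle S, \int_{\mathbb{R}^N} f(-y)\varphi(x - y)\,dy\Bigr\rangle = \langle S, f^\vee * \varphi\rangle. \tag{$*$} \]

This proves the first displayed formula. For the rest let $S$ be written according to Theorem 5.1 as $\langle S, \psi\rangle = \sum_\alpha \int_{\mathbb{R}^N} D^\alpha\psi\,d\rho_\alpha$. Then

\[ \begin{aligned} \langle S, f^\vee * \varphi\rangle &= \sum_\alpha \int_{\mathbb{R}^N} D^\alpha(f^\vee * \varphi)(x)\,d\rho_\alpha(x) \\ &= \sum_\alpha \int_{\mathbb{R}^N} \bigl((D^\alpha f^\vee) * \varphi\bigr)(x)\,d\rho_\alpha(x) \\ &= \sum_\alpha \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} D^\alpha f^\vee(x - y)\,\varphi(y)\,dy\,d\rho_\alpha(x) \\ &= \int_{\mathbb{R}^N} \Bigl[\sum_\alpha \int_{\mathbb{R}^N} (D^\alpha f^\vee)_{-y}(x)\,d\rho_\alpha(x)\Bigr] \varphi(y)\,dy \\ &= \int_{\mathbb{R}^N} \langle S, (f^\vee)_{-y}\rangle\,\varphi(y)\,dy, \end{aligned} \]

the next-to-last equality following from Fubini's Theorem. Combining this calculation with $(*)$, we see that $S * T_f = T_F$ with $F(y) = \langle S, (f^\vee)_{-y}\rangle$. The function $F$ is smooth by Proposition 5.7a.

Corollary 5.9. Convolution of members of $\mathcal{E}'(\mathbb{R}^N)$ is consistent with convolution of members of $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ in the sense that if $f$ and $g$ are in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, then $T_g * T_f$ is given by the $C^\infty$ function $T_g * f$, and this function equals $g * f$.

PROOF. The first conclusion is the result of Corollary 5.8 with $S = T_g$. For the second conclusion Corollary 5.8 gives $T_g * T_f = T_F$ with $F(y) = \langle T_g, (f^\vee)_{-y}\rangle = \int_{\mathbb{R}^N} g(x) f^\vee(x - y)\,dx = \int_{\mathbb{R}^N} g(x) f(y - x)\,dx = (g * f)(y)$. Hence $T_F = T_{g*f}$, and the second conclusion follows.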

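Corollary 5.10. If $T$ is in $\mathcal{E}'(\mathbb{R}^N)$ and $\varphi$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, then

\[ (T^\vee * \varphi)(x) = \langle T, \varphi_x\rangle. \]

PROOF. Corollary 5.8 gives $(T^\vee * \varphi)(x) = \langle T^\vee, (\varphi^\vee)_{-x}\rangle$, and the latter is equal to $\langle T, ((\varphi^\vee)_{-x})^\vee\rangle = \langle T, \varphi_x\rangle$.

Corollary 5.11. If $S$ and $T$ are in $\mathcal{E}'(\mathbb{R}^N)$ and $\varphi$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, then

\[ \langle S * T, \varphi\rangle = \langle S, T^\vee * \varphi\rangle. \]

PROOF. Proposition 5.7f and Corollary 5.10 give $\langle S * T, \varphi\rangle = \langle S, \langle T, \varphi_x\rangle\rangle = \langle S, T^\vee * \varphi\rangle$.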


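Corollary 5.12. If $T$ is in $\mathcal{E}'(\mathbb{R}^N)$, then the map $\varphi \mapsto T^\vee * \varphi$ is continuous from $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ into itself and extends continuously to a map of $C^\infty(\mathbb{R}^N)$ into itself under the definition

\[ (T^\vee * \varphi)(x) = \langle T, \varphi_x\rangle. \]

The derivatives of $T^\vee * \varphi$ satisfy $D^\alpha(T^\vee * \varphi) = T^\vee * D^\alpha\varphi$, and also $(T^\vee * \varphi)^\vee = T * \varphi^\vee$.

PROOF. The equality $(T^\vee * \varphi)(x) = \langle T, \varphi_x\rangle$ restates Corollary 5.10, and the statements about continuity follow from Proposition 5.7c. For the derivatives we use Proposition 5.7b to write $D^\alpha(T^\vee * \varphi)(x) = D^\alpha\langle T, \varphi_x\rangle = \langle T, (D^\alpha\varphi)_x\rangle = (T^\vee * D^\alpha\varphi)(x)$. Finally $(T^\vee * \varphi)^\vee(x) = (T^\vee * \varphi)(-x) = \langle T, \varphi_{-x}\rangle = \langle T^\vee, (\varphi_{-x})^\vee\rangle = \langle T^\vee, (\varphi^\vee)_x\rangle = (T * \varphi^\vee)(x)$.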
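
As a sanity check on the formula $(T^\vee * \varphi)(x) = \langle T, \varphi_x\rangle$, here is a sketch with $T$ built from the point mass at 0. The symbol $\delta$, with $\langle \delta, \varphi\rangle = \varphi(0)$, is notation assumed for this illustration; the computation uses only the definitions of $D_j T$ and $T^\vee$ from Section 2 and the linearity of $T \mapsto T * \varphi$.

```latex
\begin{aligned}
T = \delta:&\quad \delta^\vee = \delta, \qquad
 (\delta * \varphi)(x) = \langle \delta, \varphi_x\rangle = \varphi_x(0) = \varphi(x), \\
T = D_j\delta:&\quad
 (T^\vee * \varphi)(x) = \langle D_j\delta, \varphi_x\rangle
 = -\,(D_j\varphi_x)(0) = -\,D_j\varphi(x), \\
&\quad (D_j\delta)^\vee = -\,D_j\delta
 \quad\Longrightarrow\quad (D_j\delta) * \varphi = D_j\varphi .
\end{aligned}
```

So convolution with $\delta$ acts as the identity and convolution with $D_j\delta$ acts as differentiation, in agreement with the derivative formula of Corollary 5.12.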


Since $T^\vee * \varphi$ is now well defined for $T$ in $\mathcal{E}'(\mathbb{R}^N)$ and $\varphi$ in $C^\infty(\mathbb{R}^N)$, we can use the same formula as in Corollary 5.11 to make a definition of convolution of two arbitrary distributions when only one of the two distributions being convolved has compact support. Specifically if $S$ is in $\mathcal{D}'(\mathbb{R}^N)$ and $T$ is in $\mathcal{E}'(\mathbb{R}^N)$, we define $S * T$ in $\mathcal{D}'(\mathbb{R}^N)$ by the first equality of

\[ \langle S * T, \varphi\rangle = \langle S, T^\vee * \varphi\rangle = \langle S, \langle T, \varphi_x\rangle\rangle \qquad \text{for } \varphi \in C^\infty_{\mathrm{com}}(\mathbb{R}^N), \]

the second equality holding by Corollary 5.12. Corollary 5.12 shows also that $S * T$ has the necessary property of being continuous on $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, and Corollary 5.11 shows that this definition extends the definition of $S * T$ when $S$ and $T$ are in $\mathcal{E}'(\mathbb{R}^N)$.

What is missing with this definition of $S * T$ is any additional relationship that arises for distributions that equal smooth functions. For example:

• Does this new definition make $T_f * T = T_{f*T}$ when $T$ is compactly supported and $f$ does not have compact support?

• Is $S * T_f$ equal to a function when $f$ is compactly supported and $S$ is not?

• If so, are the formulas of Corollaries 5.8, 5.9, and 5.10 valid?

• If so, can we equally well define $S * T$ by $\langle S * T, \varphi\rangle = \langle T, S^\vee * \varphi\rangle = \langle T, \langle S, \varphi_y\rangle\rangle$ when $T$ is compactly supported and $S$ is not?


The answers to these questions are all affirmative. To get at the proofs, we introduce a technique of localization for members of $\mathcal{D}'(\mathbb{R}^N)$. Proposition 5.13 below is a quantitative statement of what we need. We apply the technique to obtain smoothness of functions of the form $\langle S, \varphi_y\rangle$ when $S$ is in $\mathcal{D}'(\mathbb{R}^N)$ and $\varphi$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$; this step does not make use of the above enlarged definition of $S * T$. Then we gradually make the connection with the new definition of convolution and establish all the desired properties.


Proposition 5.13. Let N be abounded open set in RR”. Let S be inD’(R”), and 
let y be in C&, (RY). If n € CX (R”) is identically 1 on the set of differences 
support(y) — N, then (S, g,) = (nS, gy) for y in N. Consequently y +> (S, gy) 
is in C©(R%). Moreover, D*(y +> (S, gy)) = (S, (D%@)y), and the linear map 
p +> (S, gy) of CX (RY) into CR) is continuous. 

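PROOF. Let $y$ be in $N$. If $x + y$ is in $\mathrm{support}(\varphi)$, then $x$ is in $\mathrm{support}(\varphi) - N$,
and $\eta(x) = 1$. Hence $\eta(x)\varphi(x + y) = \varphi(x + y)$. If $x + y$ is not in $\mathrm{support}(\varphi)$,
then $\eta(x)\varphi(x + y) = \varphi(x + y)$ because both sides are 0. Hence $\eta\varphi_y = \varphi_y$ for $y$ in
$N$, and $\langle S, \varphi_y\rangle = \langle S, \eta\varphi_y\rangle = \langle \eta S, \varphi_y\rangle$. The function $y \mapsto \langle \eta S, \varphi_y\rangle$ is smooth by
Proposition 5.7a, and hence $y \mapsto \langle S, \varphi_y\rangle$ is smooth on $N$. Since $N$ is arbitrary,
$y \mapsto \langle S, \varphi_y\rangle$ is smooth everywhere.

For the derivative formula Proposition 5.7b gives us $D^\alpha(y \mapsto \langle \eta S, \varphi_y\rangle) =
\langle \eta S, (D^\alpha\varphi)_y\rangle$ for $y$ in $N$. For $y$ in $N$, $\langle \eta S, \varphi_y\rangle = \langle S, \varphi_y\rangle$ and $\langle \eta S, (D^\alpha\varphi)_y\rangle =
\langle S, (D^\alpha\varphi)_y\rangle$. Therefore $D^\alpha(y \mapsto \langle S, \varphi_y\rangle) = \langle S, (D^\alpha\varphi)_y\rangle$ for $y$ in $N$. Since $N$
is arbitrary, $D^\alpha(y \mapsto \langle S, \varphi_y\rangle) = \langle S, (D^\alpha\varphi)_y\rangle$ everywhere.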

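For the asserted continuity of $\varphi \mapsto \langle S, \varphi_y\rangle$, it is enough to prove that this map
carries $C^\infty_K$ continuously into $C^\infty(\mathbb{R}^N)$ for each compact set $K$. If $N$ is a bounded
open set on which we are to make some $C^\infty$ estimates, choose $\eta \in C^\infty_{\mathrm{com}}(\mathbb{R}^N)$
so as to be identically 1 on the set of differences $K - N$. We have just seen that
$\langle S, \varphi_y\rangle = \langle \eta S, \varphi_y\rangle$ for all $y$ in $N$. Proposition 5.7c shows that $\varphi \mapsto \langle \eta S, \varphi_y\rangle$
is continuous from $C^\infty(\mathbb{R}^N)$ into $C^\infty(\mathbb{R}^N)$, hence from $C^\infty_K$ into $C^\infty(\mathbb{R}^N)$,
hence from $C^\infty_K$ into $C^\infty(N)$. Therefore $\varphi \mapsto \langle S, \varphi_y\rangle$ is continuous from $C^\infty_K$
into $C^\infty(\mathbb{R}^N)$.

Corollary 5.14. Let $S$ be in $\mathcal{D}'(\mathbb{R}^N)$, $T$ be in $\mathcal{E}'(\mathbb{R}^N)$, and $\varphi$ be in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$.
Then

$$\langle S * T, \varphi\rangle = \langle S, T^\vee * \varphi\rangle = \langle S, \langle T, \varphi_x\rangle\rangle = \langle T, \langle S, \varphi_y\rangle\rangle.$$

Moreover, $D^\alpha(S * T) = (D^\alpha S) * T = S * (D^\alpha T)$ for every multi-index $\alpha$.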


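REMARKS. The first two equalities follow by definition of $S * T$ and by
application of Corollary 5.12. The new statements in the corollary are the third
equality and the derivative formula. The right side $\langle T, \langle S, \varphi_y\rangle\rangle$ of the displayed
equation is well defined, since Proposition 5.13 shows that $\langle S, \varphi_y\rangle$ is in $C^\infty(\mathbb{R}^N)$.


PROOF. Let $N$ be a bounded open set containing $\mathrm{support}(T)$, and choose a function
$\eta \in C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ that is identically 1 on the set of differences $\mathrm{support}(\varphi) - N$.
Proposition 5.7g shows that

$$\begin{aligned}
\mathrm{support}(T^\vee * \varphi) &\subseteq \mathrm{support}(\varphi) + \mathrm{support}(T^\vee) \\
&= \mathrm{support}(\varphi) - \mathrm{support}(T) \\
&\subseteq \mathrm{support}(\varphi) - N,
\end{aligned}$$

and the fact that $\eta$ is identically 1 on $\mathrm{support}(\varphi) - N$ implies that

$$\eta\,(T^\vee * \varphi) = T^\vee * \varphi. \tag{$*$}$$

Meanwhile, Proposition 5.13 shows that

$$\langle S, \varphi_y\rangle = \langle \eta S, \varphi_y\rangle \tag{$**$}$$

for all $y$ in $N$, hence for all $y$ in $\mathrm{support}(T)$. Therefore

$$\begin{aligned}
\langle T, \langle S, \varphi_y\rangle\rangle &= \langle T, \langle \eta S, \varphi_y\rangle\rangle && \text{by } (**) \\
&= \langle T, (\eta S)^\vee * \varphi\rangle && \text{by Corollary 5.10} \\
&= \langle \eta S * T, \varphi\rangle && \text{by Corollary 5.11} \\
&= \langle \eta S, T^\vee * \varphi\rangle && \text{by Corollary 5.10} \\
&= \langle S, \eta\,(T^\vee * \varphi)\rangle && \text{by definition} \\
&= \langle S, T^\vee * \varphi\rangle && \text{by } (*).
\end{aligned} \tag{$\dagger$}$$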


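For one of the derivative formulas, we have

$$\langle D^\alpha(S * T), \varphi\rangle = (-1)^{|\alpha|}\langle S * T, D^\alpha\varphi\rangle = (-1)^{|\alpha|}\langle S, \langle T, (D^\alpha\varphi)_x\rangle\rangle.$$

Proposition 5.7b shows that this expression is equal to

$$(-1)^{|\alpha|}\langle S, D^\alpha\langle T, \varphi_x\rangle\rangle = \langle D^\alpha S, \langle T, \varphi_x\rangle\rangle,$$

and the definition of convolution shows that the latter expression is equal to
$\langle (D^\alpha S) * T, \varphi\rangle$. Hence $D^\alpha(S * T) = (D^\alpha S) * T$. For the other derivative formula
we have

$$\langle D^\alpha(S * T), \varphi\rangle = (-1)^{|\alpha|}\langle S * T, D^\alpha\varphi\rangle = (-1)^{|\alpha|}\langle T, \langle S, (D^\alpha\varphi)_y\rangle\rangle.$$

Proposition 5.13 shows that this expression is equal to

$$(-1)^{|\alpha|}\langle T, D^\alpha\langle S, \varphi_y\rangle\rangle = \langle D^\alpha T, \langle S, \varphi_y\rangle\rangle,$$

and step $(\dagger)$ shows that the latter expression is equal to

$$\langle S, (D^\alpha T)^\vee * \varphi\rangle = \langle S * (D^\alpha T), \varphi\rangle.$$

Hence $D^\alpha(S * T) = S * (D^\alpha T)$.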


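For $S$ in $\mathcal{D}'(\mathbb{R}^N)$ and $\varphi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, we now define

$$(S^\vee * \varphi)(y) = \langle S, \varphi_y\rangle.$$

Corollary 5.8 shows that this definition is consistent with our earlier definition
when $S$ is in the subset $\mathcal{E}'(\mathbb{R}^N)$ of $\mathcal{D}'(\mathbb{R}^N)$. Proposition 5.13 shows that the linear
map $\varphi \mapsto S * \varphi$ is continuous from $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ into $C^\infty(\mathbb{R}^N)$.

Corollary 5.15. Let $S$ be in $\mathcal{D}'(\mathbb{R}^N)$, $T$ be in $\mathcal{E}'(\mathbb{R}^N)$, and $\varphi$ be in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$.
Then

$$\langle S * T, \varphi\rangle = \langle S, T^\vee * \varphi\rangle = \langle S, \langle T, \varphi_x\rangle\rangle = \langle T, \langle S, \varphi_y\rangle\rangle = \langle T, S^\vee * \varphi\rangle,$$

and $(S * T)^\vee = S^\vee * T^\vee$.


PROOF. The displayed line just adds the above definition to the conclusion
of Corollary 5.14. For the other formula we use Corollary 5.12 to write
$\langle (S * T)^\vee, \varphi\rangle = \langle S * T, \varphi^\vee\rangle = \langle S, T^\vee * \varphi^\vee\rangle = \langle S, (T * \varphi)^\vee\rangle = \langle S^\vee, T * \varphi\rangle =
\langle S^\vee * T^\vee, \varphi\rangle$.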


With the symmetry that has been established in Corollary 5.15, we allow
ourselves to write $T * S$ for $S * T$ when $S$ is in $\mathcal{D}'(\mathbb{R}^N)$ and $T$ is in $\mathcal{E}'(\mathbb{R}^N)$. This
notation is consistent with the equality $S * T = T * S$ established in Proposition
5.7f when $S$ and $T$ both have compact support.

Corollary 5.16. Suppose that $S$ is in $\mathcal{D}'(\mathbb{R}^N)$, that $f$ is in $C^\infty(\mathbb{R}^N)$, and that
at least one of $S$ and $f$ has compact support. If $\varphi$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, then

$$\langle S * T_f, \varphi\rangle = \langle S, f^\vee * \varphi\rangle.$$

Moreover, $S * T_f$ is given by the $C^\infty$ function $y \mapsto \langle S, (f^\vee)_{-y}\rangle$, i.e.,
$S * T_f = T_F$ with $F(y) = \langle S, (f^\vee)_{-y}\rangle$.

REMARK. If both $S$ and $f$ have compact support, Corollary 5.16 reduces to
Corollary 5.8.


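PROOF. First suppose that $S$ has compact support. Theorem 5.1 allows us to
write $S$ as $\langle S, \psi\rangle = \sum_\alpha \int_{\mathbb{R}^N} D^\alpha\psi \, d\rho_\alpha$, with the sum involving only finitely many
terms and with the complex Borel measures $\rho_\alpha$ compactly supported. Applying
Corollary 5.15 to $S * T_f$ and using the definition of $S^\vee * \varphi$, we obtain

$$\begin{aligned}
\langle S * T_f, \varphi\rangle &= \int_{\mathbb{R}^N} f(y)\,(S^\vee * \varphi)(y)\, dy \\
&= \int_{\mathbb{R}^N} f(y) \sum_\alpha \int_{\mathbb{R}^N} D^\alpha\varphi_y(x)\, d\rho_\alpha(x)\, dy \\
&= \sum_\alpha \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} f(y)\, D^\alpha\varphi(x + y)\, d\rho_\alpha(x)\, dy.
\end{aligned}$$

Since $\varphi$ and the $\rho_\alpha$'s are compactly supported, we may freely interchange the
order of integration to see that the above expression is equal to

$$\begin{aligned}
\sum_\alpha \int_{\mathbb{R}^N} &\Bigl[\int_{\mathbb{R}^N} f(y)\, D^\alpha\varphi(x + y)\, dy\Bigr] d\rho_\alpha(x) \\
&= \sum_\alpha \int_{\mathbb{R}^N} (f^\vee * D^\alpha\varphi)(x)\, d\rho_\alpha(x) \\
&= \sum_\alpha \int_{\mathbb{R}^N} (D^\alpha(f^\vee) * \varphi)(x)\, d\rho_\alpha(x) \\
&= \sum_\alpha \int_{\mathbb{R}^N} \Bigl[\int_{\mathbb{R}^N} D^\alpha(f^\vee)(x - y)\,\varphi(y)\, dy\Bigr] d\rho_\alpha(x) \\
&= \int_{\mathbb{R}^N} \Bigl[\sum_\alpha \int_{\mathbb{R}^N} D^\alpha(f^\vee)(x - y)\, d\rho_\alpha(x)\Bigr] \varphi(y)\, dy \\
&= \int_{\mathbb{R}^N} \langle S, (f^\vee)_{-y}\rangle\, \varphi(y)\, dy \\
&= \langle T_F, \varphi\rangle,
\end{aligned}$$

as asserted.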
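Next suppose instead that $f$ has compact support. Then

$$\langle S * T_f, \varphi\rangle = \langle S, (T_f)^\vee * \varphi\rangle = \langle S, T_{f^\vee} * \varphi\rangle = \langle S, f^\vee * \varphi\rangle. \tag{$*$}$$

We are to show that this expression is equal to

$$\langle T_F, \varphi\rangle = \langle T_{\langle S, (f^\vee)_{-y}\rangle}, \varphi\rangle = \int_{\mathbb{R}^N} \langle S, (f^\vee)_{-y}\rangle\, \varphi(y)\, dy. \tag{$**$}$$

We introduce a member $\eta$ of $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ that is identically 1 on the set of sums
$\mathrm{support}(f^\vee) + \mathrm{support}(\varphi)$. Since $\eta S$ is in $\mathcal{E}'(\mathbb{R}^N)$, Corollary 5.8 shows that

$$\langle \eta S, f^\vee * \varphi\rangle = \int_{\mathbb{R}^N} \langle \eta S, (f^\vee)_{-y}\rangle\, \varphi(y)\, dy = \int_{\mathbb{R}^N} \langle S, \eta\,(f^\vee)_{-y}\rangle\, \varphi(y)\, dy.$$

In view of $(*)$ and $(**)$, it is therefore enough to prove the two identities

$$\langle \eta S, f^\vee * \varphi\rangle = \langle S, f^\vee * \varphi\rangle \tag{$\dagger$}$$

and

$$\int_{\mathbb{R}^N} \langle S, \eta\,(f^\vee)_{-y}\rangle\, \varphi(y)\, dy = \int_{\mathbb{R}^N} \langle S, (f^\vee)_{-y}\rangle\, \varphi(y)\, dy. \tag{$\dagger\dagger$}$$

Since $\mathrm{support}(f^\vee * \varphi) \subseteq \mathrm{support}(f^\vee) + \mathrm{support}(\varphi)$, we have $\eta\,(f^\vee * \varphi) =
f^\vee * \varphi$ and therefore $\langle \eta S, f^\vee * \varphi\rangle = \langle S, \eta\,(f^\vee * \varphi)\rangle = \langle S, f^\vee * \varphi\rangle$. This proves
$(\dagger)$.

To prove $(\dagger\dagger)$, it is enough to show that $\eta\,(f^\vee)_{-y} = (f^\vee)_{-y}$ for every $y$ in
$\mathrm{support}(\varphi)$. For a given $y$ in $\mathrm{support}(\varphi)$, there is nothing to prove at points $x$
where $(f^\vee)_{-y}(x) = 0$. If $(f^\vee)_{-y}(x) \neq 0$, then $f^\vee(x - y) \neq 0$ and $x - y$ is
in $\mathrm{support}(f^\vee)$. Hence $x = y + (x - y)$ is in $\mathrm{support}(\varphi) + \mathrm{support}(f^\vee)$, and
$\eta(x)(f^\vee)_{-y}(x) = (f^\vee)_{-y}(x)$. This proves $(\dagger\dagger)$.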


Corollary 5.17. Convolution of two distributions, one of which has compact
support, is consistent with convolution of smooth functions, one of which has
compact support, in the sense that if $f$ and $g$ are smooth and one of them has
compact support, then $T_g * T_f$ is given by the $C^\infty$ function $T_g * f$ and by the $C^\infty$
function $T_f * g$, and these functions equal $g * f$.


PROOF. We apply Corollary 5.16 with $S = T_g$, and we find that $T_g * T_f$
is given by the smooth function that carries $y$ to $\langle T_g, (f^\vee)_{-y}\rangle$. In turn, this
latter expression equals $\int_{\mathbb{R}^N} g(x)(f^\vee)_{-y}(x)\, dx = \int_{\mathbb{R}^N} g(x) f^\vee(x - y)\, dx =
\int_{\mathbb{R}^N} g(x) f(y - x)\, dx = (g * f)(y)$. Hence $T_g * f = g * f$. Reversing the
roles of $f$ and $g$, we obtain $T_f * g = f * g = g * f$.


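Corollary 5.18. If $R$, $S$, and $T$ are distributions and $\psi$ and $\varphi$ are smooth
functions, then

(a) $(T * \psi) * \varphi = T * (\psi * \varphi)$ provided at least two of $T$, $\psi$, and $\varphi$ have
compact support,

(b) $(S * T) * \varphi = (S * \varphi) * T$ provided at least two of $S$, $T$, and $\varphi$ have
compact support,

(c) $R * (S * T) = (R * S) * T$ provided at least two of $R$, $S$, and $T$ have
compact support.

PROOF. Let $\eta$ be in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$. We make repeated use of Corollaries 5.15
through 5.17 in each part. For (a), we use associativity of convolution of smooth
functions (Proposition 5.5b) to write

$$\begin{aligned}
\langle T * T_{\psi * \varphi}, \eta\rangle &= \langle T, (\psi * \varphi)^\vee * \eta\rangle = \langle T, (\psi^\vee * \varphi^\vee) * \eta\rangle \\
&= \langle T, \psi^\vee * (\varphi^\vee * \eta)\rangle = \langle T * T_\psi, \varphi^\vee * \eta\rangle \\
&= \langle (T * T_\psi) * T_\varphi, \eta\rangle.
\end{aligned}$$

Thus $T * T_{\psi * \varphi} = (T * T_\psi) * T_\varphi$. Since $T * T_{\psi * \varphi} = T_{T * (\psi * \varphi)}$ and $(T * T_\psi) * T_\varphi =
T_{T * \psi} * T_\varphi = T_{(T * \psi) * \varphi}$, we obtain $T * (\psi * \varphi) = (T * \psi) * \varphi$. This proves (a).
For (b), we use (a) to write

$$\begin{aligned}
\langle (S * T) * T_\varphi, \eta\rangle &= \langle S * T, \varphi^\vee * \eta\rangle = \langle S, T^\vee * (\varphi^\vee * \eta)\rangle \\
&= \langle S, (T^\vee * \varphi^\vee) * \eta\rangle = \langle S, (T * \varphi)^\vee * \eta\rangle \\
&= \langle S, (T * T_\varphi)^\vee * \eta\rangle = \langle S * (T * T_\varphi), \eta\rangle.
\end{aligned}$$

Thus $(S * T) * T_\varphi = S * (T * T_\varphi)$. Since $(S * T) * T_\varphi = T_{(S * T) * \varphi}$ and $S * (T * T_\varphi) =
S * T_{T * \varphi} = T_{S * (T * \varphi)}$, we obtain $(S * T) * \varphi = S * (T * \varphi)$.
For (c), we use (b) to write

$$\begin{aligned}
\langle R * (S * T), \eta\rangle &= \langle R, (S * T)^\vee * \eta\rangle = \langle R, (S^\vee * T^\vee) * \eta\rangle \\
&= \langle R, S^\vee * (T^\vee * \eta)\rangle = \langle R * S, T^\vee * \eta\rangle \\
&= \langle (R * S) * T, \eta\rangle.
\end{aligned}$$

Thus $R * (S * T) = (R * S) * T$, and (c) is proved.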


We conclude with a special property of one particular distribution. The Dirac
distribution at the origin is the member of $\mathcal{E}'(\mathbb{R}^N)$ given by $\langle \delta, \varphi\rangle = \varphi(0)$. It
has support $\{0\}$. The proposition below shows that the differentiation operation
$D^\alpha$ on distributions equals convolution with the distribution $D^\alpha\delta$.


Proposition 5.19. If $T$ is in $\mathcal{D}'(\mathbb{R}^N)$ and if $\delta$ denotes the Dirac distribution at
the origin, then $\delta * T = T$. Consequently $D^\alpha\delta * T = D^\alpha T$ for every multi-index
$\alpha$.

PROOF. For $\varphi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, Corollary 5.14 gives $\langle \delta * T, \varphi\rangle = \langle \delta, \langle T, \varphi_y\rangle\rangle =
\langle T, \varphi\rangle$, and therefore $\delta * T = T$. Applying $D^\alpha$ and using the second conclusion
of Corollary 5.14, we obtain $(D^\alpha\delta) * T = D^\alpha(\delta * T) = D^\alpha T$.
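As a sanity check on Proposition 5.19, the following sketch works out the special case $T = T_f$ with $f$ smooth, using the formula $F(y) = \langle S, (f^\vee)_{-y}\rangle$ of Corollary 5.16 with $S = D^\alpha\delta$. It assumes the conventions $\varphi_y(x) = \varphi(x + y)$ and $f^\vee(x) = f(-x)$ that appear in the proofs above, so that $(f^\vee)_{-y}(x) = f(y - x)$; it is an illustration only, not part of the proof.

```latex
\begin{align*}
(D^\alpha\delta * T_f)(y)
  &= \bigl\langle D^\alpha\delta,\; x \mapsto f(y - x) \bigr\rangle \\
  &= (-1)^{|\alpha|}\, D_x^\alpha \bigl[ f(y - x) \bigr]\Big|_{x = 0} \\
  &= (-1)^{|\alpha|} (-1)^{|\alpha|}\, (D^\alpha f)(y) \\
  &= (D^\alpha f)(y).
\end{align*}
```

Thus $D^\alpha\delta * T_f$ is given by the function $D^\alpha f$, which is also the function giving $D^\alpha T_f$ when $f$ is smooth, in agreement with the proposition. Taking $\alpha = 0$ recovers $\delta * T_f = T_f$.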
The final tool we need in order to make the theory of distributions useful for
linear partial differential equations is the Fourier transform. Let us write $\mathcal{F}$ for
the Fourier transform on the various places it acts, its initial definition being
$(\mathcal{F}f)(\xi) = \int_{\mathbb{R}^N} f(x)\, e^{-2\pi i x\cdot\xi}\, dx$ on $L^1(\mathbb{R}^N)$. Since the Schwartz space $\mathcal{S}(\mathbb{R}^N)$
is contained in $L^1(\mathbb{R}^N)$, this definition of $\mathcal{F}$ is applicable on $\mathcal{S}(\mathbb{R}^N)$, and it was
shown in Basic that $\mathcal{F}$ is one-one from $\mathcal{S}(\mathbb{R}^N)$ onto itself. We continue to use the
same angular-brackets notation for $\mathcal{S}'(\mathbb{R}^N)$ as for $\mathcal{D}'(\mathbb{R}^N)$ and $\mathcal{E}'(\mathbb{R}^N)$. Then, as a
consequence of Corollary 3.3b, the Fourier transform is well defined on elements
$T$ of $\mathcal{S}'(\mathbb{R}^N)$ under the definition $\langle \mathcal{F}(T), \varphi\rangle = \langle T, \mathcal{F}(\varphi)\rangle$ for $\varphi \in \mathcal{S}(\mathbb{R}^N)$, and
Proposition 3.4 shows that $\mathcal{F}$ is one-one from $\mathcal{S}'(\mathbb{R}^N)$ onto itself. On tempered
distributions that are $L^1$ or $L^2$ functions, $\mathcal{F}$ agrees with the usual definitions on
functions. For $f$ in $L^1$, the verification comes down to the multiplication formula:

$$\langle \mathcal{F}T_f, \varphi\rangle = \langle T_f, \mathcal{F}\varphi\rangle = \int f(x)\,(\mathcal{F}\varphi)(x)\, dx = \int (\mathcal{F}f)(x)\,\varphi(x)\, dx = \langle T_{\mathcal{F}f}, \varphi\rangle.$$

For $f$ in $L^2$, we choose a sequence $\{f_n\}$ in $L^1 \cap L^2$ tending to $f$ in $L^2$, obtain
$\langle \mathcal{F}T_{f_n}, \varphi\rangle = \langle T_{\mathcal{F}f_n}, \varphi\rangle$ for each $n$, and then check by continuity that we can pass
to the limit.

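The formulas that are used to establish the effect of $\mathcal{F}$ on $\mathcal{S}(\mathbb{R}^N)$ come from
the behavior of differentiation and multiplication by polynomials on Fourier
transforms and are

$$D^\alpha(\mathcal{F}f)(x) = \mathcal{F}\bigl((-2\pi i)^{|\alpha|} x^\alpha f\bigr)(x)
\quad\text{and}\quad
x^\beta(\mathcal{F}f)(x) = \mathcal{F}\bigl((2\pi i)^{-|\beta|} D^\beta f\bigr)(x).$$

Let us define the effect of $D^\alpha$ and multiplication by $x^\beta$ on tempered distributions
and then see how the Fourier transform interacts with these operations. If $\varphi$ is
in $\mathcal{S}(\mathbb{R}^N)$, then $D^\alpha\varphi$ is in $\mathcal{S}(\mathbb{R}^N)$, and hence it makes sense to define $D^\alpha T$ for
$T \in \mathcal{S}'(\mathbb{R}^N)$ by $\langle D^\alpha T, \varphi\rangle = (-1)^{|\alpha|}\langle T, D^\alpha\varphi\rangle$. The product of an arbitrary smooth
function on $\mathbb{R}^N$ by a Schwartz function need not be a Schwartz function, and thus
the product of an arbitrary smooth function and a tempered distribution need not
make sense as a tempered distribution. However, the product of a polynomial
and a Schwartz function is a Schwartz function, and thus we can define $x^\beta T$ for
$T \in \mathcal{S}'(\mathbb{R}^N)$ by $\langle x^\beta T, \varphi\rangle = \langle T, x^\beta\varphi\rangle$. The formulas for the Fourier transform
are then

$$\mathcal{F}(D^\alpha T) = (2\pi i)^{|\alpha|} x^\alpha \mathcal{F}(T)
\quad\text{and}\quad
\mathcal{F}(x^\beta T) = (-2\pi i)^{-|\beta|} D^\beta \mathcal{F}(T).$$

In fact, we compute that $\langle \mathcal{F}(D^\alpha T), \varphi\rangle = \langle D^\alpha T, \mathcal{F}\varphi\rangle = (-1)^{|\alpha|}\langle T, D^\alpha\mathcal{F}\varphi\rangle =
(-1)^{|\alpha|}\langle T, \mathcal{F}((-2\pi i)^{|\alpha|} x^\alpha\varphi)\rangle = (2\pi i)^{|\alpha|}\langle \mathcal{F}(T), x^\alpha\varphi\rangle = (2\pi i)^{|\alpha|}\langle x^\alpha\mathcal{F}(T), \varphi\rangle$
and that $\langle \mathcal{F}(x^\beta T), \varphi\rangle = \langle x^\beta T, \mathcal{F}\varphi\rangle = \langle T, x^\beta\mathcal{F}\varphi\rangle = \langle T, \mathcal{F}((2\pi i)^{-|\beta|} D^\beta\varphi)\rangle =
(2\pi i)^{-|\beta|}\langle \mathcal{F}(T), D^\beta\varphi\rangle = (-2\pi i)^{-|\beta|}\langle D^\beta\mathcal{F}(T), \varphi\rangle$.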
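The second formula can be checked by hand in a simple case. The sketch below takes $N = 1$, $\beta = 1$, and $T = T_1$, the distribution given by the constant function 1. It assumes only the Fourier inversion formula at the origin in the form $\int_{\mathbb{R}} (\mathcal{F}g)(x)\, dx = g(0)$ for $g \in \mathcal{S}(\mathbb{R}^1)$, which in particular gives $\mathcal{F}(T_1) = \delta$, together with the function-level identity $x(\mathcal{F}\varphi)(x) = \mathcal{F}((2\pi i)^{-1} D\varphi)(x)$ displayed above.

```latex
\begin{align*}
\langle \mathcal{F}(x T_1), \varphi \rangle
  &= \langle T_1, x\, \mathcal{F}\varphi \rangle
   = \int_{\mathbb{R}} x\, (\mathcal{F}\varphi)(x)\, dx \\
  &= (2\pi i)^{-1} \int_{\mathbb{R}} \mathcal{F}(D\varphi)(x)\, dx
   = (2\pi i)^{-1} (D\varphi)(0) \\
  &= -(2\pi i)^{-1} \langle D\delta, \varphi \rangle
   = (-2\pi i)^{-1} \langle D\, \mathcal{F}(T_1), \varphi \rangle .
\end{align*}
```

The sign change in the last line comes from $\langle D\delta, \varphi\rangle = -\langle \delta, D\varphi\rangle = -(D\varphi)(0)$, and the result matches $\mathcal{F}(x^\beta T) = (-2\pi i)^{-|\beta|} D^\beta \mathcal{F}(T)$ with $|\beta| = 1$.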

We have seen that the restriction map carries $\mathcal{E}'(\mathbb{R}^N)$ in one-one fashion into
$\mathcal{S}'(\mathbb{R}^N)$. Therefore we can identify members of $\mathcal{E}'(\mathbb{R}^N)$ with certain members
of $\mathcal{S}'(\mathbb{R}^N)$ when it is convenient to do so, and in particular the Fourier transform
becomes a well-defined one-one map of $\mathcal{E}'(\mathbb{R}^N)$ into $\mathcal{S}'(\mathbb{R}^N)$. (The Fourier
transform is not usable, however, with $\mathcal{D}'(\mathbb{R}^N)$.) The somewhat surprising fact is
that the Fourier transform of a member of $\mathcal{E}'(\mathbb{R}^N)$ is actually a smooth function,
not just a distribution. We shall prove this fact as a consequence of Theorem
5.1, which has expressed distributions of compact support in terms of complex
measures of compact support.


Theorem 5.20. If $T$ is a member of $\mathcal{E}'(\mathbb{R}^N)$ with support in a compact subset
$K$ of $\mathbb{R}^N$, then the tempered distribution $\mathcal{F}(T)$ equals a smooth function that
extends to an entire holomorphic function on $\mathbb{C}^N$. The value of this function at
$z \in \mathbb{C}^N$ is given by

$$\mathcal{F}(T)(z) = \langle T, e^{-2\pi i z\cdot(\,\cdot\,)}\rangle,$$

and there is a positive integer $m$ such that this function satisfies

$$|D^\beta(\mathcal{F}T)(\xi)| \le C_\beta (1 + |\xi|)^m$$

for $\xi \in \mathbb{R}^N$ and for every multi-index $\beta$.


REMARK. The estimate shows that the product of $\langle T, e^{-2\pi i x\cdot(\,\cdot\,)}\rangle$ by a Schwartz
function is again a Schwartz function, hence that the tempered distribution $\mathcal{F}(T)$
is indeed given by a certain smooth function.


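PROOF. Fix a compact set $K'$ whose interior contains $K$. Theorem 5.1 allows
us to write

$$\langle T, \varphi_0\rangle = \sum_{|\alpha| \le m} \int_{K'} D^\alpha\varphi_0 \, d\rho_\alpha$$

for all $\varphi_0 \in C^\infty(\mathbb{R}^N)$. Replacing $\varphi_0$ by $e^{-2\pi i z\cdot(\,\cdot\,)}$ gives

$$\langle T, e^{-2\pi i z\cdot(\,\cdot\,)}\rangle = \sum_{|\alpha| \le m} \int_{\xi \in K'} D_\xi^\alpha e^{-2\pi i z\cdot\xi}\, d\rho_\alpha(\xi),$$

which shows that $z \mapsto \langle T, e^{-2\pi i z\cdot(\,\cdot\,)}\rangle$ is holomorphic in $\mathbb{C}^N$ and gives the estimate

$$\bigl|D_x^\beta \langle T, e^{-2\pi i x\cdot(\,\cdot\,)}\rangle\bigr| \le \sum_{|\alpha| \le m} \int_{\xi \in K'} \bigl|D_x^\beta D_\xi^\alpha e^{-2\pi i x\cdot\xi}\bigr|\, d|\rho_\alpha|(\xi) \le C_\beta (1 + |x|)^m.$$

Replacing $\varphi_0$ by $\mathcal{F}\varphi$ with $\varphi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ gives

$$\begin{aligned}
\langle \mathcal{F}(T), \varphi\rangle = \langle T, \mathcal{F}\varphi\rangle
&= \sum_{|\alpha| \le m} \int_{\xi \in K'} D^\alpha \mathcal{F}\varphi(\xi)\, d\rho_\alpha(\xi) \\
&= \sum_{|\alpha| \le m} \int_{\xi \in K'} D_\xi^\alpha \int_{x \in \mathbb{R}^N} e^{-2\pi i x\cdot\xi}\, \varphi(x)\, dx\, d\rho_\alpha(\xi) \\
&= \sum_{|\alpha| \le m} \int_{\xi \in K'} \int_{x \in \mathbb{R}^N} D_\xi^\alpha e^{-2\pi i x\cdot\xi}\, \varphi(x)\, dx\, d\rho_\alpha(\xi) \\
&= \int_{x \in \mathbb{R}^N} \Bigl(\sum_{|\alpha| \le m} \int_{\xi \in K'} D_\xi^\alpha e^{-2\pi i x\cdot\xi}\, d\rho_\alpha(\xi)\Bigr) \varphi(x)\, dx \\
&= \int_{x \in \mathbb{R}^N} \langle T, e^{-2\pi i x\cdot(\,\cdot\,)}\rangle\, \varphi(x)\, dx.
\end{aligned}$$

Both sides are continuous functions of the Schwartz-space variable $\varphi$ on the dense
subset $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, and hence the formula extends to be valid for $\varphi$ in $\mathcal{S}(\mathbb{R}^N)$. This
proves that $\mathcal{F}(T)$ is given on $\mathcal{S}(\mathbb{R}^N)$ by the function $x \mapsto \langle T, e^{-2\pi i x\cdot(\,\cdot\,)}\rangle$. The
estimate on $D^\beta$ of this function has been obtained above, and the theorem follows.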


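EXAMPLE. There is an important instance of the formula of Theorem 5.20
that can be established directly without appealing to the theorem. The Dirac
distribution $\delta$ at the origin, defined by $\langle \delta, \varphi\rangle = \varphi(0)$, has Fourier transform $\mathcal{F}(\delta)$
equal to the constant function 1 because $\langle \mathcal{F}(\delta), \varphi\rangle = \langle \delta, \mathcal{F}(\varphi)\rangle = \mathcal{F}(\varphi)(0) =
\int_{\mathbb{R}^N} \varphi\, dx = \langle T_1, \varphi\rangle$, where $T_1$ denotes the distribution equal to the smooth function
1. Therefore $\mathcal{F}(D^\alpha\delta) = (2\pi i)^{|\alpha|} x^\alpha T_1$, i.e., $\mathcal{F}(D^\alpha\delta)$ equals the function
$x \mapsto (2\pi i)^{|\alpha|} x^\alpha$. The formula of the theorem when $T = D^\alpha\delta$ says that this
function equals $\langle D^\alpha\delta, e^{-2\pi i x\cdot(\,\cdot\,)}\rangle$, and we can see this equality directly because

$$\langle D^\alpha\delta, e^{-2\pi i x\cdot(\,\cdot\,)}\rangle = (-1)^{|\alpha|}\langle \delta, D^\alpha e^{-2\pi i x\cdot(\,\cdot\,)}\rangle = (-1)^{|\alpha|}(-2\pi i)^{|\alpha|} x^\alpha \langle \delta, e^{-2\pi i x\cdot(\,\cdot\,)}\rangle = (2\pi i)^{|\alpha|} x^\alpha.$$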
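A second illustration of Theorem 5.20, this time with a distribution that is a function, may help fix the statement. The sketch below takes $N = 1$ and $T = T_f$ with $f$ the indicator function of $[-\tfrac12, \tfrac12]$; this choice of $f$ is an assumption made for the example and does not come from the text. The formula $\mathcal{F}(T)(z) = \langle T, e^{-2\pi i z\cdot(\,\cdot\,)}\rangle$ then reduces to an elementary integral.

```latex
\begin{align*}
\mathcal{F}(T_f)(z)
  &= \int_{-1/2}^{1/2} e^{-2\pi i z x}\, dx
   = \frac{e^{-\pi i z} - e^{\pi i z}}{-2\pi i z}
   = \frac{\sin(\pi z)}{\pi z} \qquad (z \neq 0), \\
\mathcal{F}(T_f)(0) &= 1, \\
\bigl| D^\beta \mathcal{F}(T_f)(\xi) \bigr|
  &= \Bigl| \int_{-1/2}^{1/2} (-2\pi i x)^\beta e^{-2\pi i \xi x}\, dx \Bigr|
   \le \int_{-1/2}^{1/2} (2\pi |x|)^\beta\, dx
   \le \pi^\beta \qquad (\xi \in \mathbb{R}).
\end{align*}
```

The function $z \mapsto \sin(\pi z)/(\pi z)$ extends to an entire function of $z \in \mathbb{C}$, and every derivative is bounded on the real axis, so the estimate of the theorem holds here with any positive integer $m$.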


We know that the convolution of two distributions is meaningful if one of them 
has compact support. Since the (pointwise) product of two general tempered 
distributions is undefined, we might not at first expect that the Fourier transform 
could be helpful with understanding this kind of convolution. However, Theorem 
5.20 says that there is reason for optimism: the product of the Fourier transform 
of a distribution of compact support by a tempered distribution is indeed defined. 
This is the clue that suggests the second theorem of this section. 


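Theorem 5.21. If $S$ is in $\mathcal{E}'(\mathbb{R}^N)$ and $T$ is in $\mathcal{S}'(\mathbb{R}^N)$, then $S * T$ is in $\mathcal{S}'(\mathbb{R}^N)$,
and $\mathcal{F}(S * T) = \mathcal{F}(S)\mathcal{F}(T)$.


PROOF. We know that $S * T$ is in $\mathcal{D}'(\mathbb{R}^N)$, and we shall check that $S * T$ is
actually in $\mathcal{S}'(\mathbb{R}^N)$, so that $\mathcal{F}(S * T)$ is defined. We start with $\varphi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$
and the identity $\langle S * T, \varphi\rangle = \langle S, T^\vee * \varphi\rangle = \langle S^\vee, T * \varphi^\vee\rangle$. Since $S$ has compact
support, there is a compact set $K$ and there are constants $C$ and $m$ such that

$$\begin{aligned}
|\langle S * T, \varphi\rangle| &\le C \sum_{|\alpha| \le m} \sup_{x \in K} |D^\alpha(T * \varphi^\vee)(x)| = C \sum_{|\alpha| \le m} \sup_{x \in K} |(T * D^\alpha(\varphi^\vee))(x)| \\
&= C \sum_{|\alpha| \le m} \sup_{x \in K} \bigl|\langle T, ((D^\alpha(\varphi^\vee))^\vee)_{-x}\rangle\bigr| = C \sum_{|\alpha| \le m} \sup_{x \in K} \bigl|\langle T, (D^\alpha\varphi)_{-x}\rangle\bigr|.
\end{aligned}$$

Since $T$ is tempered, there exist constants $C'$, $m'$, and $k$ such that the right side is

$$\le C C' \sup_{\substack{|\alpha| \le m,\ x \in K, \\ |\beta| \le m',\ y \in \mathbb{R}^N}} \bigl|(1 + |y|^2)^k D^\beta (D^\alpha\varphi)_{-x}(y)\bigr|;$$

in turn, this expression is estimated by Schwartz-space norms for $\varphi$, and thus
$S * T$ is in $\mathcal{S}'(\mathbb{R}^N)$.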
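Now let $\varphi$ and $\psi$ be Schwartz functions with $\varphi$ and $\mathcal{F}(\psi)$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$. Then

$$\begin{aligned}
\langle \mathcal{F}(T_\varphi * T), \psi\rangle &= \langle T_\varphi * T, \mathcal{F}(\psi)\rangle = \langle T, \varphi^\vee * \mathcal{F}\psi\rangle \\
&= \langle \mathcal{F}(T), \mathcal{F}^{-1}(\varphi^\vee * \mathcal{F}\psi)\rangle = \langle \mathcal{F}(T), (\mathcal{F}^{-1}(\varphi^\vee))(\mathcal{F}^{-1}(\mathcal{F}\psi))\rangle \\
&= \langle \mathcal{F}(T), \mathcal{F}^{-1}(\varphi^\vee)\,\psi\rangle = \langle \mathcal{F}(T), (\mathcal{F}\varphi)\,\psi\rangle = \langle (\mathcal{F}\varphi)\,\mathcal{F}(T), \psi\rangle,
\end{aligned}$$

the next-to-last equality following since $\mathcal{F}^{-1}(\varphi^\vee) = \mathcal{F}(\varphi)$ by the Fourier inversion
formula. Since the $\psi$'s with $\mathcal{F}(\psi)$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ are dense in $\mathcal{S}(\mathbb{R}^N)$,

$$\mathcal{F}(T_\varphi * T) = \mathcal{F}(\varphi)\,\mathcal{F}(T). \tag{$*$}$$

Finally let $\varphi$ and $\psi$ be in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$. Corollary 5.18 gives $T_\varphi * (S * T) =
(T_\varphi * S) * T$. Taking the Fourier transform of both sides and applying $(*)$ three
times, we obtain

$$\begin{aligned}
\mathcal{F}(\varphi)\,\mathcal{F}(S * T) &= \mathcal{F}(T_\varphi * (S * T)) = \mathcal{F}((T_\varphi * S) * T) \\
&= \mathcal{F}(T_\varphi * S)\,\mathcal{F}(T) = \mathcal{F}(\varphi)\,\mathcal{F}(S)\,\mathcal{F}(T).
\end{aligned}$$

Hence we have $\langle \mathcal{F}(\varphi)\mathcal{F}(S * T), \psi\rangle = \langle \mathcal{F}(\varphi)\mathcal{F}(S)\mathcal{F}(T), \psi\rangle$ and therefore

$$\langle \mathcal{F}(S * T), \mathcal{F}(\varphi)\,\psi\rangle = \langle \mathcal{F}(S)\mathcal{F}(T), \mathcal{F}(\varphi)\,\psi\rangle \quad \text{for all } \varphi \in C^\infty_{\mathrm{com}}(\mathbb{R}^N).$$

The set of functions $\mathcal{F}(\varphi)$ is dense in $\mathcal{S}(\mathbb{R}^N)$. Moreover, if $\eta_k \to \eta$ in $\mathcal{S}(\mathbb{R}^N)$,
then $\eta_k\psi \to \eta\psi$ in $\mathcal{S}(\mathbb{R}^N)$. Choosing a sequence of $\varphi$'s for which $\mathcal{F}(\varphi)$ tends in
$\mathcal{S}(\mathbb{R}^N)$ to a function in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ that is 1 on the support of $\psi$, we obtain

$$\langle \mathcal{F}(S * T), \psi\rangle = \langle \mathcal{F}(S)\mathcal{F}(T), \psi\rangle.$$

Since the set of $\psi$'s is dense in $\mathcal{S}(\mathbb{R}^N)$, we conclude that $\mathcal{F}(S * T) = \mathcal{F}(S)\mathcal{F}(T)$.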
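The simplest instance of Theorem 5.21 is translation. In the sketch below, $\delta_a$ denotes the point mass at $a \in \mathbb{R}^N$, defined by $\langle \delta_a, \varphi\rangle = \varphi(a)$; this notation is introduced only for the example. The computation uses Corollary 5.15 for the first line and assumes, as the proof above justifies, that the resulting identity extends by continuity from $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ to Schwartz functions.

```latex
\begin{align*}
\langle \delta_a * T, \varphi \rangle
  &= \langle \delta_a, \langle T, \varphi_y \rangle \rangle
   = \langle T, \varphi_a \rangle ,
  \qquad \varphi_a(x) = \varphi(x + a), \\
\mathcal{F}(\delta_a)(x)
  &= \langle \delta_a, e^{-2\pi i x \cdot (\,\cdot\,)} \rangle
   = e^{-2\pi i x \cdot a}, \\
\langle \mathcal{F}(\delta_a * T), \psi \rangle
  &= \langle T, (\mathcal{F}\psi)_a \rangle
   = \langle T, \mathcal{F}( e^{-2\pi i a \cdot (\,\cdot\,)} \psi ) \rangle \\
  &= \langle \mathcal{F}(T), e^{-2\pi i a \cdot (\,\cdot\,)} \psi \rangle
   = \langle \mathcal{F}(\delta_a)\, \mathcal{F}(T), \psi \rangle .
\end{align*}
```

The middle step uses $(\mathcal{F}\psi)(\xi + a) = \int \psi(x) e^{-2\pi i x\cdot a} e^{-2\pi i x\cdot\xi}\, dx$. The outcome is the familiar rule that translating a tempered distribution multiplies its Fourier transform by a complex exponential, here recovered as $\mathcal{F}(\delta_a * T) = \mathcal{F}(\delta_a)\mathcal{F}(T)$.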
5. Fundamental Solution of Laplacian


The availability of distributions makes it possible to write familiar partial differential
equations in a general but convenient notation. For example consider the
equation $\Delta u = f$ in $\mathbb{R}^N$, where $\Delta$ is the Laplacian. We regard $f$ as known and $u$
as unknown. Ordinarily we might think of $f$ as some function, possibly with some
smoothness properties, and we are seeking a solution $u$ that is another function.
However, we can regard any locally integrable function $f$ as a distribution $T_f$ and
seek a distribution $T$ with $\Delta T = T_f$. In this sense the equation $\Delta u = f$ in the
sense of distributions includes the equation in the ordinary sense of functions.

In this section we shall solve this equation when the distribution on the right
side has compact support. To handle existence, the technique is to exhibit a
fundamental solution for the Laplacian, i.e., a solution of the equation $\Delta T = \delta$,
where $\delta$ is the Dirac distribution at 0, and then to use the rules of Sections 2-3 for
manipulating distributions.⁷ The argument for this special case will avoid using
the full power of Theorem 5.21, but a generalization to other "elliptic" operators
with constant coefficients that we consider in Chapter VII will call upon the full
theorem.

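In this section we shall make use of Green's formula for a ball, as in Proposition
3.14. As we observed in a footnote when applying the proposition in the proof of
Theorem 3.16, the result as given in that proposition directly extends from balls
to the difference of two balls. The extended result is as follows: If $B_R$ and $B_\varepsilon$
are closed concentric balls of radii $\varepsilon < R$ and if $u$ and $v$ are $C^2$ functions on a
neighborhood of $E = B_R \cap (B_\varepsilon^o)^c$, then

$$\int_E (u\,\Delta v - v\,\Delta u)\, dx = \int_{\partial E} \Bigl(u\,\frac{\partial v}{\partial n} - v\,\frac{\partial u}{\partial n}\Bigr)\, d\sigma,$$

where $d\sigma$ is "surface-area" measure on $\partial E$ and the indicated derivatives are
directional derivatives pointing outward from $E$ in the direction of a unit normal
vector.


Theorem 5.22. In $\mathbb{R}^N$ with $N > 2$, let $T$ be the tempered distribution
$-\Omega_{N-1}^{-1}(N - 2)^{-1}|x|^{-(N-2)}\, dx$, where $\Omega_{N-1}$ is the area of the unit sphere $S^{N-1}$.
Then $\Delta T = \delta$, where $\delta$ is the Dirac distribution at 0.


REMARK. The statement uses the name $f(x)\, dx$ for a certain distribution,
rather than $T_f$, for the sake of readability.


⁷ Although a fundamental solution for the Laplacian is being shown to exist, it is not unique. One
can add to it the distribution $T_f$ for any smooth function $f$ that is harmonic in all of $\mathbb{R}^N$.

PROOF. We are to prove that each $\varphi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ satisfies $\langle \Delta T, \varphi\rangle = \langle \delta, \varphi\rangle$,
i.e., that the second equality holds in the chain of equalities

$$\varphi(0) = \langle \delta, \varphi\rangle = \langle \Delta T, \varphi\rangle = \langle T, \Delta\varphi\rangle = -\frac{1}{\Omega_{N-1}(N - 2)} \int_{\mathbb{R}^N} \frac{\Delta\varphi(x)}{|x|^{N-2}}\, dx.$$

We apply Green's formula as above with the closed balls $B_R$ and $B_\varepsilon$ centered at the
origin, with $R$ chosen large enough so that $\mathrm{support}(\varphi) \subseteq B_R^o$, with $u = |x|^{-(N-2)}$,
and with $v = \varphi$. Writing $r$ for $|x|$ and observing that $\Delta u = 0$ on $B_R - B_\varepsilon$ and
that $\frac{\partial\varphi}{\partial n} = -\nabla\varphi \cdot \frac{x}{r}$ on the boundary of $B_\varepsilon$, we obtain

$$\int_{\omega \in S^{N-1}} \Bigl( r^{-(N-2)} \Bigl(-\frac{x \cdot \nabla\varphi}{r}\Bigr) - \varphi\,(N - 2)\, r^{-(N-1)} \Bigr)\Big|_{x = \varepsilon\omega}\, \varepsilon^{N-1}\, d\omega = \int_{B_R - B_\varepsilon} r^{-(N-2)}\, \Delta\varphi\, dx.$$

On the left side the first term has $|x \cdot \nabla\varphi| / r$ bounded; hence its absolute value
is at most a constant times $\int_{S^{N-1}} \varepsilon\, d\omega$, which tends to 0 as $\varepsilon$ decreases to 0. The
second term on the left side is $-(N - 2)\varepsilon^{-(N-1)} \int_{S^{N-1}} \varphi\, \varepsilon^{N-1}\, d\omega$, and it tends, as
$\varepsilon$ decreases to 0, to $-(N - 2)\Omega_{N-1}\varphi(0)$. The result in the limit as $\varepsilon$ decreases
to 0 is that

$$-(N - 2)\Omega_{N-1}\varphi(0) = \int_{\mathbb{R}^N} r^{-(N-2)}\, \Delta\varphi\, dx,$$

and the theorem follows.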
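Two facts about $u = |x|^{-(N-2)}$ are used in the proof of Theorem 5.22 without computation: that $\Delta u = 0$ away from the origin, and the value of the radial derivative that enters the boundary term. The sketch below verifies both, assuming the standard formula $\Delta u = g'' + \frac{N-1}{r} g'$ for a radial function $u(x) = g(|x|)$ on $\mathbb{R}^N - \{0\}$.

```latex
\begin{align*}
g(r) &= r^{-(N-2)}, \\
g'(r) &= -(N-2)\, r^{-(N-1)}, \\
g''(r) &= (N-2)(N-1)\, r^{-N}, \\
\Delta u &= g''(r) + \frac{N-1}{r}\, g'(r)
          = (N-2)(N-1)\, r^{-N} - (N-1)(N-2)\, r^{-N} = 0 .
\end{align*}
```

On the inner sphere $|x| = \varepsilon$ the outward normal from $E$ points toward the origin, so $\partial u / \partial n = -g'(\varepsilon) = (N-2)\varepsilon^{-(N-1)}$; multiplied by the surface factor $\varepsilon^{N-1}$ this gives the constant $N - 2$ that appears in the limit.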


Corollary 5.23. In $\mathbb{R}^N$ with $N > 2$, let $T$ be the tempered distribution
$-\Omega_{N-1}^{-1}(N - 2)^{-1}|x|^{-(N-2)}\, dx$, where $\Omega_{N-1}$ is the area of the unit sphere $S^{N-1}$.
If $f$ is in $\mathcal{E}'(\mathbb{R}^N)$, then $u = T * f$ is a tempered distribution and is a solution of
$\Delta u = f$.


PROOF. Let $\delta$ be the Dirac distribution at 0, so that $\Delta T = \delta$ by Theorem 5.22.
Theorem 5.21 shows that $T * f$ is a tempered distribution, and Corollary 5.14
and Proposition 5.19 give $\Delta(T * f) = (\Delta T) * f = \delta * f = f$, as required.
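For orientation, the case $N = 3$ of Corollary 5.23 gives the classical Newtonian potential. The sketch below assumes that the compactly supported distribution $f$ is given by a function $g \in C^\infty_{\mathrm{com}}(\mathbb{R}^3)$ (the name $g$ is introduced only for this example) and uses $\Omega_2 = 4\pi$ together with the formula $F(y) = \langle S, (g^\vee)_{-y}\rangle$ of Corollary 5.16 with $S = T$.

```latex
\begin{align*}
T &= -\frac{1}{4\pi}\, |x|^{-1}\, dx
   \qquad (N = 3,\ \Omega_2 = 4\pi,\ N - 2 = 1), \\
u(y) &= (T * T_g)(y)
      = \langle T, (g^\vee)_{-y} \rangle
      = -\frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{g(y - x)}{|x|}\, dx
      = -\frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{g(z)}{|y - z|}\, dz , \\
\Delta u &= g .
\end{align*}
```

The function $u$ is smooth by Corollary 5.16, and the last line is the statement of Corollary 5.23 in this special case.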


BIBLIOGRAPHICAL REMARKS. The development in Sections 2-4 is adapted
from Hörmander's Volume I of The Analysis of Linear Partial Differential
Equations.


6. Problems 


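1. Prove that if $U$ and $V$ are open subsets of $\mathbb{R}^N$ with $U \subseteq V$, then the inclusion
$C^\infty_{\mathrm{com}}(U) \to C^\infty_{\mathrm{com}}(V)$ is continuous.


2. Prove that if $\varphi$ is in $C^\infty_{\mathrm{com}}(U)$, then the map $\psi \mapsto \psi\varphi$ of $C^\infty(U)$ into $C^\infty_{\mathrm{com}}(U)$
is continuous.


3. Let $U$ be a nonempty open set in $\mathbb{R}^N$. Any member $T_U$ of $\mathcal{E}'(U)$ extends to a
member $T$ of $\mathcal{E}'(\mathbb{R}^N)$ under the definition $\langle T, \varphi\rangle = \langle T_U, \varphi\big|_U\rangle$ for $\varphi \in C^\infty(\mathbb{R}^N)$.
Prove that this is truly an extension in the sense that if $\varphi_1$ is in $C^\infty(U)$ and if $\varphi$
is in $C^\infty(\mathbb{R}^N)$ and agrees with $\varphi_1$ in a neighborhood of the support of $T_U$, then
$\langle T, \varphi\rangle = \langle T_U, \varphi\big|_U\rangle = \langle T_U, \varphi_1\rangle$.


4. Prove the following variant of Theorem 5.1: Let $K$ and $K'$ be closed balls of $\mathbb{R}^N$
with $K$ contained in the interior of $K'$. If $T$ is a member of $\mathcal{E}'(\mathbb{R}^N)$ with support
in $K$, then there exist a positive integer $m$ and members $g_\alpha$ of $L^2(K', dx)$ for
each multi-index $\alpha$ with $|\alpha| \le m$ such that

$$\langle T, \varphi\rangle = \sum_{|\alpha| \le m} \int_{K'} (D^\alpha\varphi)\, g_\alpha\, dx \quad \text{for all } \varphi \in C^\infty(\mathbb{R}^N).$$


5. Let $K$ be a compact metric space, and let $\mu$ be a Borel measure on $K$. Suppose
that $\Phi = \Phi(x, y)$ is a scalar-valued function on $\mathbb{R}^N \times K$ such that $\Phi(\,\cdot\,, y)$ is
smooth for each $y$ in $K$, and suppose further that every iterated partial derivative
$D_x^\alpha\Phi$ in the first variable is continuous on $\mathbb{R}^N \times K$. Define

$$F(x) = \int_K \Phi(x, y)\, d\mu(y).$$

(a) Prove that any $T$ in $\mathcal{E}'(\mathbb{R}^N)$ satisfies $\langle T, F\rangle = \int_K \langle T, \Phi(\,\cdot\,, y)\rangle\, d\mu(y)$.

(b) Suppose that $\Phi$ has compact support in $\mathbb{R}^N \times K$. Prove that any $S$ in $\mathcal{D}'(\mathbb{R}^N)$
satisfies $\langle S, F\rangle = \int_K \langle S, \Phi(\,\cdot\,, y)\rangle\, d\mu(y)$.


6. Suppose that $T$ is a distribution on an open set $U$ in $\mathbb{R}^N$ such that $\langle T, \varphi\rangle \ge 0$
whenever $\varphi$ is a member of $C^\infty_{\mathrm{com}}(U)$ that is $\ge 0$. Prove that there is a Borel
measure $\mu \ge 0$ on $U$ such that $\langle T, \varphi\rangle = \int_U \varphi\, d\mu$ for all $\varphi$ in $C^\infty_{\mathrm{com}}(U)$.


7. Verify the formula of Theorem 5.22 for $\varphi(x) = e^{-\pi|x|^2}$, namely that

$$\int_{\mathbb{R}^N} |x|^{-(N-2)} (\Delta\varphi)(x)\, dx = -\Omega_{N-1}(N - 2)\,\varphi(0)$$

for this $\varphi$, by evaluating the integral in spherical coordinates.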


Problems 8-11 deal with special situations in which the conclusion of Theorem 5.1
can be improved to say that a distribution with support in a set $K$ is expressible as the
sum of iterated partial derivatives of finite complex Borel measures supported in $K$.


8. This problem classifies distributions on $\mathbb{R}^1$ supported at $\{0\}$. By Proposition
3.5f let $\eta$ be a member of $C^\infty_{\mathrm{com}}(\mathbb{R}^1)$ with values in $[0, 1]$ that is identically 1 for
$|x| \le \tfrac12$ and is 0 for $|x| \ge 1$. Suppose that $T$ is a distribution with support at $\{0\}$.
Choose constants $C$, $M$, and $n$ such that $|\langle T, \varphi\rangle| \le C \sum_{k=0}^{n} \sup_{|x| \le M} |D^k\varphi(x)|$
for all $\varphi$ in $C^\infty(\mathbb{R}^1)$.

(a) For $\varepsilon > 0$, define $\eta_\varepsilon(x) = \eta(\varepsilon^{-1}x)$. Prove for each $k \ge 0$ that there is a
constant $C_k$ independent of $\varepsilon$ such that $\sup_x |D^k\eta_\varepsilon(x)| \le C_k\varepsilon^{-k}$.

(b) Using the assumption that $T$ has support at $\{0\}$, prove that $\langle T, \varphi\rangle = \langle T, \eta_\varepsilon\varphi\rangle$
for every $\varphi$ in $C^\infty(\mathbb{R}^1)$.

(c) Suppose that $\varphi$ is of the form $\varphi(x) = \psi(x)x^{n+1}$ with $\psi$ in $C^\infty(\mathbb{R}^1)$. By
applying (b) and estimating $|\langle T, \eta_\varepsilon\varphi\rangle|$ by means of the Leibniz rule and (a),
prove that this special kind of $\varphi$ has $T(\varphi) = 0$.

(d) Using a Taylor expansion involving derivatives through order $n$ and a remainder
term, prove for general $\varphi$ in $C^\infty(\mathbb{R}^1)$ that $\langle T, \varphi\rangle$ is a linear combination
of $\varphi(0), D^1\varphi(0), \dots, D^n\varphi(0)$, hence that $T$ is a linear combination
of $\delta, D^1\delta, \dots, D^n\delta$.


9. By suitably adapting the argument in the previous problem, show that every
distribution on $\mathbb{R}^N$ that is supported at $\{0\}$ is a finite linear combination of the
distributions $D^\alpha\delta$, where $\delta$ is the Dirac distribution at 0.


10. Let the members $x$ of $\mathbb{R}^N$ be written as pairs $(x', x'')$ with $x'$ in $\mathbb{R}^L$ and $x''$ in
$\mathbb{R}^{N-L}$. Suppose that $T$ is a distribution on $\mathbb{R}^N$ that is supported in $\mathbb{R}^L$. By
using a Taylor expansion in the variables $x''$ with coefficients involving $x'$ and by
adapting the argument for the previous two problems, prove that $T$ is a finite sum
of the form $\langle T, \varphi\rangle = \sum_\alpha \langle T_\alpha, (D^\alpha\varphi)\big|_{\mathbb{R}^L}\rangle$, the sum being over multi-indices
$\alpha$ involving only $x''$ variables and each $T_\alpha$ being in $\mathcal{E}'(\mathbb{R}^L)$. (Educational note:
The operators $D^\alpha$ of this kind are called transverse derivatives to $\mathbb{R}^L$. The
result is that $T$ is a finite sum of transverse derivatives of compactly supported
distributions on $\mathbb{R}^L$.)


11. Using the result of Problem 9, prove the following uniqueness result to accompany
Corollary 5.23: if $f$ is a distribution of compact support in $\mathbb{R}^N$ with $N > 2$,
then any two tempered distributions $u$ on $\mathbb{R}^N$ that solve $\Delta u = f$ differ by
a polynomial function annihilated by $\Delta$. Is this uniqueness still valid if $u$ is
allowed to be any distribution that solves $\Delta u = f$?


Problems 12-13 introduce a notion of periodic distribution as any continuous linear
functional on the space of periodic smooth functions on $\mathbb{R}^N$. Write $T$ for the circle
$\mathbb{R}/2\pi\mathbb{Z}$, and let $C^\infty(T^N)$ be the complex vector space of all smooth functions on
$\mathbb{R}^N$ that are periodic of period $2\pi$ in each variable. Regard $C^\infty(T^N)$ as a vector
subspace of $C^\infty((-2\pi, 2\pi)^N)$, and give it the relative topology. Then define $\mathcal{P}'(T^N)$
to be the space of restrictions to $C^\infty(T^N)$ of members of $\mathcal{E}'((-2\pi, 2\pi)^N)$. For $S$ in
$\mathcal{P}'(T^N)$, define the Fourier series of $S$ to be the trigonometric series $\sum_{k \in \mathbb{Z}^N} c_k e^{ik\cdot x}$
with $c_k = \langle S, e^{-ik\cdot x}\rangle$.

12. Prove that the Fourier coefficients $c_k$ for such an $S$ satisfy $|c_k| \le C(1 + |k|^2)^{m/2}$
for some constant $C$ and positive integer $m$.


13. Prove that any trigonometric series $\sum_{k \in \mathbb{Z}^N} c_k e^{ik\cdot x}$ in which the $c_k$'s satisfy $|c_k| \le
C(1 + |k|^2)^{m/2}$ for some constant $C$ and positive integer $m$ is the Fourier series
of some member $S$ of $\mathcal{P}'(T^N)$.


Problems 14-19 establish the Schwartz Kernel Theorem in the setting of periodic
functions. We make use of Problems 25-34 in Chapter III concerning Sobolev spaces
$L^2_k(T^N)$ of periodic functions. As a result of those problems, the metric on $C^\infty(T^N)$
may be viewed as given by the separating family of seminorms $\|\cdot\|_{L^2_k(T^N)}$, $k \ge 0$,
and $C^\infty(T^N)$ is a complete metric space. The Schwartz Kernel Theorem says that
any bilinear function $B : C^\infty(T^N) \times C^\infty(T^N) \to \mathbb{C}$ that is separately continuous in
the two variables is given by "integration with" a distribution on $T^N \times T^N \cong T^{2N}$.
The analogous assertion about signed measures is false.

14. Let $B : C^\infty(T^N) \times C^\infty(T^N) \to \mathbb{C}$ be a function that is bilinear in the sense of
being linear in each argument when the other argument is fixed, and suppose that
$B$ is continuous in each variable. The continuity in the first variable means that
for each $\psi \in C^\infty(T^N)$, there is an integer $k$ and there is some constant $C_{\psi,k}$ such
that $|B(\varphi, \psi)| \le C_{\psi,k}\|\varphi\|_{L^2_k(T^N)}$ for all $\varphi$ in $C^\infty(T^N)$, and a similar inequality
governs the behavior in the $\psi$ variable for each $\varphi$. For integers $k \ge 0$ and $M \ge 0$,
define

$$E_{k,M} = \bigl\{\psi \in C^\infty(T^N) \;\big|\; |B(\varphi, \psi)| \le M\|\varphi\|_{L^2_k(T^N)} \text{ for all } \varphi \in C^\infty(T^N)\bigr\}.$$

(a) Prove that each $E_{k,M}$ is closed and that the union of these sets on $k$ and $M$
is $C^\infty(T^N)$.

(b) Apply the Baire Category Theorem, and prove as a consequence that there
exist an integer $k \ge 0$ and a constant $C$ such that

$$|B(\varphi, \psi)| \le C\|\varphi\|_{L^2_k(T^N)}\|\psi\|_{L^2_k(T^N)}$$

for all $\varphi$ and $\psi$ in $C^\infty(T^N)$.


15. Let $B$ be as in Problem 14, and suppose that $k$ and $C$ are chosen as in Problem 14b.
Fix an integer $K > N/2$, and define $k' = k + K$. Prove that

$$|B(D^\alpha\varphi, D^\beta\psi)| \le C\|\varphi\|_{L^2_{k'}(T^N)}\|\psi\|_{L^2_{k'}(T^N)}$$

for all $\varphi$ and $\psi$ in $C^\infty(T^N)$ and all multi-indices $\alpha$ and $\beta$ with $|\alpha| \le K$ and
$|\beta| \le K$.


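16. Let $B$, $C$, $K$, and $k'$ be as in Problem 15. Put $b_{lm} = B(e^{il\cdot(\,\cdot\,)}, e^{im\cdot(\,\cdot\,)})$ for $l$ and
$m$ in $\mathbb{Z}^N$, and for each pair of multi-indices $(\alpha, \beta)$ with $|\alpha| \le k'$ and $|\beta| \le k'$,
define

$$F_{\alpha,\beta}(x, y) = \sum_{l, m \in \mathbb{Z}^N} \frac{b_{lm}\,(-i)^{|\alpha| + |\beta|}\, l^\alpha m^\beta\, e^{-il\cdot x}\, e^{-im\cdot y}}{\Bigl(\sum_{|\alpha'| \le k'} l^{2\alpha'}\Bigr)\Bigl(\sum_{|\beta'| \le k'} m^{2\beta'}\Bigr)}$$

for $(x, y) \in T^N \times T^N$. Prove that this series is convergent in $L^2(T^N \times T^N)$.


17. With $B$, $C$, $K$, and $k'$ as in Problem 15 and with $F_{\alpha,\beta}$ as in Problem 16 for
$|\alpha| \le k'$ and $|\beta| \le k'$, define

$$B'(\varphi, \psi) = \sum_{\substack{|\alpha| \le k', \\ |\beta| \le k'}} (2\pi)^{-2N} \int_{[-\pi,\pi]^N \times [-\pi,\pi]^N} F_{\alpha,\beta}(x, y)\,(D^\alpha\varphi)(x)\,(D^\beta\psi)(y)\, dx\, dy$$

for $\varphi$ and $\psi$ in $C^\infty(T^N)$. Prove that $B'$ is well defined for all $\varphi$ and $\psi$ in $C^\infty(T^N)$
and that $B'(e^{il\cdot(\,\cdot\,)}, e^{im\cdot(\,\cdot\,)}) = B(e^{il\cdot(\,\cdot\,)}, e^{im\cdot(\,\cdot\,)})$ for all $l$ and $m$ in $\mathbb{Z}^N$.


18. With $B'$ as in the previous problem, prove that $B'(\varphi, \psi) = B(\varphi, \psi)$ for all $\varphi$ and
$\psi$ in $C^\infty(T^N)$, and conclude that there exists a distribution $S$ in $\mathcal{P}'(T^{2N})$ such
that

$$B(\varphi, \psi) = \langle S, \varphi \otimes \psi\rangle$$

for all $\varphi$ and $\psi$ in $C^\infty(T^N)$ if $\varphi \otimes \psi$ is defined by $(\varphi \otimes \psi)(x, y) = \varphi(x)\psi(y)$.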


19. Let $\eta$ be a function in $C^\infty_{\mathrm{com}}(\mathbb{R}^1)$ with values in $[0, 1]$ that is 1 for $|x| \le \tfrac12$ and
is 0 for $|x| \ge 1$. For $f$ continuous on $T^1$, the Hilbert transform

$$(H(\eta f))(x) = \lim_{\varepsilon \downarrow 0} \frac{1}{\pi} \int_{|y| \ge \varepsilon} \frac{\eta(x - y)\, f(x - y)}{y}\, dy$$

exists as an $L^2(\mathbb{R}^1)$ limit.

(a) Let $C(T^1)$ be the space of continuous periodic functions on $\mathbb{R}$ of period $2\pi$,
and give it the supremum norm. Taking into account that $H$, as an operator
from $L^2(\mathbb{R}^1)$ to itself, has norm 1, prove that

$$B(f, g) = \int_{-\pi}^{\pi} (H(\eta f))(x)\,(\eta g)(x)\, dx$$

is bilinear on $C(T^1) \times C(T^1)$ and is continuous in each variable.

(b) Prove that there is no complex Borel measure $\rho(x, y)$ on $[-\pi, \pi]^2$ such that
$B(f, g) = \int_{[-\pi,\pi]^2} f(x)\,g(y)\, d\rho(x, y)$ for all $f$ and $g$ in $C(T^1)$.


CHAPTER VI 


Compact and Locally Compact Groups 


Abstract. This chapter investigates several ways that groups play a role in real analysis. For the 
most part the groups in question have a locally compact Hausdorff topology. 


Section 1 introduces topological groups, their quotient spaces, and continuous group actions.
Topological groups are groups that are topological spaces in such a way that multiplication and 
inversion are continuous. Their quotient spaces by subgroups are of interest when they are Hausdorff, 
and this is the case when the subgroups are closed. Many examples are given, and elementary 
properties are established for topological groups and their quotients by closed subgroups. 


Sections 2-4 investigate translation-invariant regular Borel measures on locally compact groups 
and invariant measures on their quotient spaces. Section 2 deals with existence and uniqueness in the 
group case. A left Haar measure on a locally compact group G is a nonzero regular Borel measure 
invariant under left translations, and right Haar measures are defined similarly. The theorem is that 
left and right Haar measures exist on G, and each kind is unique up to a scalar factor. Section 
3 addresses the relationship between left Haar measures and right Haar measures, which do not 
necessarily coincide. The relationship is captured by the modular function, which is a certain 
continuous homomorphism of the group into the multiplicative group of positive reals. The modular 
function plays a role in constructing Haar measures for complicated groups out of Haar measures for 
subgroups. Of special interest are “unimodular” locally compact groups G, i.e., those for which the 
left Haar measures coincide with the right Haar measures. Every compact group, and of course every 
locally compact abelian group, is unimodular. Section 4 concerns translation-invariant measures on 
quotient spaces G/H. For the setting in which G is a locally compact group and H is a closed 
subgroup, the theorem is that G/H has a nonzero regular Borel measure invariant under the action 
of G if and only if the restriction to H of the modular function of G coincides with the modular 
function of H. In this case the G invariant measure is unique up to a scalar factor. Section 5 
introduces convolution on unimodular locally compact groups G. Familiar results valid for the 
additive group of Euclidean space, such as those concerning convolution of functions in various $L^p$
classes, extend to be valid for such groups G. 


Sections 6-8 concern the representation theory of compact groups. Section 6 develops the 
elementary theory of finite-dimensional representations and includes some examples, Schur
orthogonality, and properties of characters. Section 7 contains the Peter-Weyl Theorem, giving an
orthonormal basis of $L^2$ in terms of irreducible representations and concluding with an Approximation
Theorem showing how to approximate continuous functions on a compact group by trigonometric 
polynomials. Section 8 shows that infinite-dimensional unitary representations of compact groups 
decompose canonically according to the irreducible finite-dimensional representations of the group. 
An example is given to show how this theorem may be used to take advantage of the symmetry in 
analyzing a bounded operator that commutes with a compact group of unitary operators. The same 
principle applies in analyzing partial differential operators. 
1. Topological Groups


The theme of this chapter is the interaction of real analysis with groups. We shall 
work with topological groups, their quotients, and continuous group actions, all 
of which are introduced in this section. A topological group is a group G with a
Hausdorff topology such that multiplication, as a mapping $G \times G \to G$, and
inversion, as a mapping $G \to G$, are continuous. A homomorphism of topological
groups is a continuous group homomorphism. An isomorphism of topological 
groups is a group isomorphism that is a homeomorphism of topological spaces. 


EXAMPLES. 

(1) Any discrete group, i.e., any group with the discrete topology.

(2) The additive group R or C with the usual metric topology. The group 
operation is addition, and the inversion operation is negation. 

(3) The multiplicative groups $\mathbb{R}^\times = \mathbb{R} - \{0\}$ and $\mathbb{C}^\times = \mathbb{C} - \{0\}$, with the
relative topology from R or C.

(4) Any subgroup of a topological group, with the relative topology. Thus, for
example, the circle $\{z \in \mathbb{C} \mid |z| = 1\}$ is a subgroup of $\mathbb{C}^\times$.


(5) Any product of topological groups, with the product topology. Thus, 
for example, the additive groups $\mathbb{R}^N$ and $\mathbb{C}^N$ are topological groups. So is the
countable product of two-element groups, each with the discrete topology; in this 
case the topological space in question is homeomorphic to the standard Cantor 
set in [0, 1]. 

(6) The general linear group GL(N, C) of all nonsingular N-by-N complex 
matrices, with matrix multiplication as group operation. The topology is the 
relative topology from $\mathbb{C}^{N^2}$. Each entry in a matrix product is a polynomial in
the $2N^2$ entries of the two matrices being multiplied and is therefore continuous;
thus matrix multiplication is continuous. Inversion is defined on the set where 
the determinant polynomial is not 0 and is given, according to Cramer’s rule, in 
each entry by the quotient of a polynomial function and the determinant function; 
thus inversion is continuous. By (4), the general linear group GL(N, R) is a 
topological group. 

(7) The additive group of any topological vector space in the sense of Section 
IV.1. The additive groups of normed linear spaces are special cases. 
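
The remark in Example 5 about the Cantor set can be explored on finite truncations. The sketch below is only an illustration under stated assumptions: it cuts the countable product of two-element groups down to n coordinates, and the names `cantor_point`, `add`, and `stage_left_endpoints` are hypothetical helpers, not notation from the text. It sends a 0/1 sequence $(e_1, \dots, e_n)$ to $\sum_k 2e_k/3^k$ and compares the resulting points with the left endpoints of the stage-n intervals of the middle-thirds construction.

```python
from fractions import Fraction
from itertools import product

def cantor_point(bits):
    """Send a finite 0/1 sequence (e_1, ..., e_n) to sum_k 2 e_k / 3^k."""
    return sum(Fraction(2 * e, 3 ** k) for k, e in enumerate(bits, start=1))

def add(bits1, bits2):
    """Group operation in a product of two-element groups:
    coordinatewise addition modulo 2."""
    return tuple((a + b) % 2 for a, b in zip(bits1, bits2))

def stage_left_endpoints(n):
    """Left endpoints of the 2^n closed intervals of length 3^-n that remain
    after n steps of the middle-thirds construction in [0, 1]."""
    lefts = [Fraction(0)]
    for k in range(1, n + 1):
        # each surviving interval keeps its left third and its right third
        lefts = [x for left in lefts for x in (left, left + Fraction(2, 3 ** k))]
    return lefts

n = 4
points = sorted(cantor_point(bits) for bits in product((0, 1), repeat=n))
```

Here `points` and `stage_left_endpoints(n)` can be compared directly, and `add(bits, bits)` returns the zero sequence because every element of the product has order at most two.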


In working with topological groups, we shall use expressions like 


\[
aU = \{au \mid u \in U\} \quad\text{and}\quad Ub = \{ub \mid u \in U\},
\]
\[
U^{-1} = \{u^{-1} \mid u \in U\} \quad\text{and}\quad UV = \{uv \mid u \in U,\ v \in V\}.
\]
In any topological group every left translation $y \mapsto xy$ and every right translation
$y \mapsto yx$ is a homeomorphism. The continuity of each translation follows by
restriction from the continuity of multiplication, and the continuity of the inverse
of a translation follows because the inverse of a translation is translation by the
inverse element. For an abstract topological group, we write 1 for the identity
element.

Continuity of the multiplication mapping $G \times G \to G$ at (1, 1) implies, for
any open neighborhood V of the identity in G, that there is an open neighborhood
U of the identity for which $UU \subseteq V$. Inversion, being a continuous operation of
order two, carries open sets to open sets; therefore if U is an open neighborhood
of the identity, so is $U \cap U^{-1}$. Combining these facts, we see that if V is an
open neighborhood of the identity, then there is an open neighborhood U of the
identity such that $UU^{-1} \subseteq V$.

Conversely if whenever V is an open neighborhood of the identity, there is an
open neighborhood U of the identity such that $UU^{-1} \subseteq V$, then it follows that the
mapping $(x, y) \mapsto xy^{-1}$ is continuous at (x, y) = (1, 1). If also all translations
are homeomorphisms, then $(x, y) \mapsto xy^{-1}$ is continuous, and it follows easily
that $x \mapsto x^{-1}$ and $(x, y) \mapsto xy$ are continuous.


Proposition 6.1. If G is a topological group, then G is regular as a topological 
space. 


PROOF. We are to separate by disjoint open sets a point x and a closed set
F with $x \notin F$. Since translations are homeomorphisms, we may assume x to
be 1. Then $V = F^c$ is an open neighborhood of 1, and we can choose an open
neighborhood U of 1 such that $UU \subseteq V$. Let us see that $U^{\mathrm{cl}} \subseteq V$. From
$UU \subseteq V$ and $1 \in U$, we have $U \subseteq V$. Thus let y be in $U^{\mathrm{cl}} - U$. Since y is then
a limit point of U and since $U^{-1}y$ is an open neighborhood of y, $U^{-1}y$ meets
U. If z is in $U^{-1}y \cap U$, then $z = u^{-1}y$ for some u in U, and so y = uz is in
$UU \subseteq V$. Thus $U^{\mathrm{cl}} \subseteq V$ and $U^{\mathrm{cl}} \cap F = \varnothing$. Consequently G is regular.


If H is a subgroup of G, then the quotient space G/H of left cosets aH 
results from the equivalence relation that a ~ b if there is some h in H with 
a = bh. The quotient space is given the quotient topology. Quotient spaces of 
topological groups are sometimes called homogeneous spaces. 


Proposition 6.2. Let G be a topological group, let H be a closed subgroup,
and let $q : G \to G/H$ be the quotient map. Then q is an open map, and
G/H is a Hausdorff regular space such that the action of G on G/H given by
$(g, aH) \mapsto (ga)H$ is continuous. Moreover,


(a) G separable implies G/H separable,
(b) G locally compact implies G/H locally compact,
(c) G is compact if and only if H and G/H are compact,
(d) H normal in the group-theoretic sense implies that G/H is a topological
group.


PROOF. Let U be open. To show that q(U) is open, we are to show that
$q^{-1}(q(U))$ is open. But $q^{-1}(q(U)) = \bigcup_{h \in H} Uh$, which is open, being the union
of open sets. Hence q is open.

To consider the action of G on G/H, we start from the continuous open mapping
$1 \times q : G \times G \to G \times (G/H)$ given by $(g, a) \mapsto (g, aH)$. This descends to
a well-defined one-one mapping $\tilde{q} : (G \times G)/(1 \times H) \to G \times (G/H)$ given
by $(g, a)(1 \times H) \mapsto (g, aH)$, and the quotient topology is defined in such a
way that this is continuous. The image under $\tilde{q}$ of an open set is the same as
the image under $1 \times q$ of the same open set, and this is open. Therefore $\tilde{q}$ is a
homeomorphism.

The mapping $(g, a) \mapsto (ga)H$ is the composition of multiplication $(g, a) \mapsto ga$
followed by q and is therefore continuous. Hence it descends to a continuous
map $(g, a)(1 \times H) \mapsto (ga)H$. If $\tilde{q}^{-1}$ is followed by this continuous map, the
resulting map is $(g, aH) \mapsto (ga)H$, which is the action of G on G/H. Hence
the action is continuous.

To see that G/H is regular, we are to separate by disjoint open sets a point x
in G/H and a closed set F with $x \notin F$. The continuity of the action shows that
we may assume x to be 1H. Then $M = F^c$ is an open neighborhood of 1H in
G/H, and the continuity of the action at (1, 1H) shows that we can choose an
open neighborhood U of 1 in G and an open neighborhood N of 1H in G/H
such that $UN \subseteq M$. Let us see that $N^{\mathrm{cl}} \subseteq M$. Using the identity element of U,
we see that $N \subseteq M$. Thus let y be in $N^{\mathrm{cl}} - N$. Since y is then a limit point of N
and since $U^{-1}y$ is an open neighborhood of y (q being open), $U^{-1}y$ meets N. If
z is in $U^{-1}y \cap N$, then $z = u^{-1}y$ for some u in U, and so y = uz is in $UN \subseteq M$.
Thus $N^{\mathrm{cl}} \subseteq M$ and $N^{\mathrm{cl}} \cap F = \varnothing$. Consequently G/H is regular.

To see that G/H is Hausdorff, consider the inverse image under q of a coset 
xH. This inverse image is xH as a subset of G, and this subset is closed in G 
since H is closed and translations are homeomorphisms. Thus G/H is $T_1$, as
well as regular, and consequently it is Hausdorff. 

Conclusion (a) follows from the fact that q is open, since the image under q of
a countable base of open sets is therefore a countable base for G/H. Conclusion 
(b) is similarly immediate; the image of a compact neighborhood of a point is a 
compact neighborhood of the image point. 

In (c), let G be compact. Then H is compact as a closed subset of a compact set,
and G/H is compact as the continuous image of a compact set. In the converse
direction let $\mathcal{U}$ be an open cover of G. For each x in G, $\mathcal{U}$ is an open cover of the
subset xH of G, which is compact since it is homeomorphic to H. Let $\mathcal{V}_x$ be a
finite subcover of xH, and let

\[
V_x = \{y \in G \mid yH \text{ is covered by } \mathcal{V}_x\}.
\]


We show that $V_x$ is open in G. Let $W_x$ be the open union of the members of
$\mathcal{V}_x$. If y is in $V_x$, then yh is in $W_x$ for all h in H. For each such h, we use the
continuity of multiplication to find open neighborhoods $U_h$ of 1 and $N_h$ of h in
G such that $U_h y N_h \subseteq W_x$. As h varies, the sets $N_h$ cover H. If $\{N_{h_1}, \dots, N_{h_m}\}$
is a finite subcover, then each set $(U_{h_1} \cap \cdots \cap U_{h_m})\, y N_{h_j}$ lies in $W_x$ and hence so
does $(U_{h_1} \cap \cdots \cap U_{h_m})\, yH$. Thus $(U_{h_1} \cap \cdots \cap U_{h_m})\, y$ lies in $V_x$, and $V_x$ is open.

The definition of $V_x$ makes $V_x H = V_x$, and thus $q^{-1}(qV_x) = V_x$. The open
sets $V_x$ together cover G, and hence the open sets $qV_x$ cover G/H. Since G/H is
compact, some finite subcollection $\{qV_{x_1}, \dots, qV_{x_n}\}$ covers G/H. The equality
$q^{-1}(qV_{x_j}) = V_{x_j}$ for all j implies that $\{V_{x_1}, \dots, V_{x_n}\}$ is an open cover of G. Then
$\bigcup_{j=1}^{n} \mathcal{V}_{x_j}$ is a finite subcollection of $\mathcal{U}$ that covers G. This proves (c).

In (d), suppose that H is group-theoretically normal, and let V be an open
neighborhood of 1 in G/H. Choose, by the continuity of the action on G/H, an
open neighborhood U of 1 in G and an open neighborhood N of 1H in G/H such
that $UN \subseteq V$. Then qU and N are open neighborhoods of the identity in G/H
such that $(qU)N \subseteq V$. Hence multiplication in G/H is continuous at (1, 1).
Since the map $G \to G/H$ given for fixed aH by $g \mapsto (ga)H$ is continuous,
the descended map $gH \mapsto (gH)(aH)$ is continuous. Thus left translations are
continuous on G/H, and multiplication on G/H is continuous everywhere. To
see continuity of inversion on G/H, let V be an open neighborhood of 1 in
G/H, and let U be an open neighborhood of 1 in G with $U^{-1} \subseteq q^{-1}(V)$. Then
$q(U)^{-1} \subseteq V$, and inversion is continuous at the identity. Since left and right
translations are continuous on G/H, inversion is continuous everywhere. This
completes the proof.


Proposition 6.3. If G is a topological group, then 
(a) any open subgroup H of G is closed and the quotient G/H has the discrete
topology, 
(b) any discrete subgroup H of G (i.e., any subgroup whose relative topology 
is the discrete topology) is closed. 


REMARK. Despite (a), a closed subgroup need not be open. For example, the 
closed subgroup Z of integers is not open in the additive group R. 


PROOF. For (a), if H is an open subgroup, then every subset xH of G is open
in G. Then the formula $H = G - \bigcup_{x \notin H} xH$ shows that H is closed. Also,
since $G \to G/H$ is an open map, the openness of the subset xH of G implies
that every one-element set {xH} in G/H is open. Thus G/H has the discrete
topology.
For (b), choose by discreteness an open neighborhood V of 1 in G such that
$H \cap V = \{1\}$. By continuity of multiplication, choose an open neighborhood U
of 1 with $UU \subseteq V$. If H is not closed, let x be a limit point of H that is not
in H. Then the neighborhood $U^{-1}x$ of x must contain a member h of H, and h
cannot equal x since x is not in H. Write $u^{-1}x = h$ with $u \in U$. Then $u = xh^{-1}$
is a limit point of H that is not in H, and we can find $h' \ne 1$ in H such that $h'$
is in Uu. But $Uu \subseteq UU \subseteq V$, and so $h'$ is in $H \cap V = \{1\}$, contradiction. We
conclude that H contains all its limit points and is therefore closed.


A compact group is a topological group whose topology is compact Hausdorff.
Similarly a locally compact group is a topological group whose topology is
locally compact Hausdorff. Among the examples at the beginning of this section, the
following are locally compact: any group with the discrete topology, the additive
groups R and C, the multiplicative groups $\mathbb{R}^\times$ and $\mathbb{C}^\times$, the circle as a subgroup
of $\mathbb{C}^\times$, the additive groups $\mathbb{R}^N$ and $\mathbb{C}^N$, the general linear groups GL(N, R)
and GL(N, C), and the additive groups of finite-dimensional topological vector
spaces. An arbitrary direct product of compact groups, with the product topology,
is a compact group. Similarly any finite direct product of locally compact groups
is a locally compact group.

A number of interesting subgroups of GL(N, R) and GL(N, C) are defined 
as the sets of matrices where certain polynomials vanish. Since polynomials are 
continuous, these subgroups are closed in GL(N, R) or GL(N,C). The next 
proposition says that they provide further examples of locally compact groups. 


Proposition 6.4. Any closed subgroup of a locally compact group is a locally
compact group in the relative topology.


PROOF. Let G be the given locally compact group, and let H be the closed
subgroup. As a subgroup of a topological group, H is a topological group. For
local compactness, choose a compact neighborhood $U_h$ in G of any element h of
H. Then $U_h \cap H$ is a compact set in H since H is closed, and it is a neighborhood
of h in the relative topology. Thus h has a compact neighborhood, and H is a
locally compact group.


EXAMPLES OF CLOSED SUBGROUPS OF GL(N, R) AND GL(N, C). 


(1) Affine group of the line. This consists of all matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ with a and b
real and with a > 0.


(2) Upper triangular group over R or C. This consists of all matrices whose
entries on the diagonal are all nonzero, whose entries above the diagonal are
arbitrary, and whose entries below the diagonal are 0.
(3) Commutator subgroup of previous example. This consists of all matrices
whose entries on the diagonal are all 1, whose entries above the diagonal are 
arbitrary in R or C, and whose entries below the diagonal are 0. 


(4) Special linear group SL(N, F) with F equal to R or C. This consists of all 
N-by-N matrices with determinant 1. 


(5) Symplectic group Sp(N, F) with F equal to R or C. This consists of all
2N-by-2N matrices g with determinant 1 such that $g^{\mathrm{tr}} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} g = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.


(6) Unitary group U(N). This consists of all N-by-N complex matrices g that 
are unitary in the sense that $g^* g = 1$. The group is compact; the compactness of
the topology follows since the members of U(N) form a closed bounded subset 
of a Euclidean space. The group SU(N) is the subgroup of all g in U(N) with 
determinant 1; it is a closed subgroup of U(N) and hence is compact. 

(7) Orthogonal group O(N) and rotation group SO(N). The group O(N)
consists of all N-by-N real matrices that are orthogonal in the sense that $g^{\mathrm{tr}} g = 1$;
it is the intersection¹ of the unitary group U(N) with GL(N, R). Members of
O(N) have determinant ±1. The subgroup SO(N) consists of those members of
O(N) with determinant 1, i.e., the rotations. The groups O(N) and SO(N) are 
compact. 
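
For the symplectic condition in Example 5, the smallest case N = 1 can be checked by hand or by machine. The sketch below assumes that the fixed matrix in the defining identity is $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ with 1-by-1 blocks; the helper names are hypothetical. For any 2-by-2 matrix g one has $g^{\mathrm{tr}} J g = (\det g) J$, so for N = 1 the symplectic condition amounts to det g = 1.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Assumed form of the fixed matrix in the symplectic condition, for N = 1.
J = [[0, 1], [-1, 0]]

def symplectic_form(g):
    """Return g^tr J g for a 2-by-2 matrix g."""
    return mat_mul(mat_mul(transpose(g), J), g)

def det2(g):
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

# Two matrices of determinant 1 and one of determinant 4.
samples = [[[2, 1], [1, 1]], [[1, 5], [0, 1]], [[3, 2], [1, 2]]]
checks = [symplectic_form(g) == [[0, det2(g)], [-det2(g), 0]] for g in samples]
```

The first two samples satisfy `symplectic_form(g) == J`, while the third does not, in line with the determinant criterion.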


Proposition 6.5. If G is a locally compact group, then 


(a) any compact neighborhood V of 1 with $V = V^{-1}$ has the property that
$H = \bigcup_{n=1}^{\infty} V^n$ is a σ-compact open subgroup,
(b) G is normal as a topological space. 

PROOF. The set $V^n$ is the result of applying the multiplication mapping to
$V \times \cdots \times V$ with n factors. This mapping is continuous, and hence $V^n$ is
compact. Thus H is σ-compact. Since $V^m V^n = V^{m+n}$, H is closed under
multiplication. Since $V = V^{-1}$, we have $V^n = (V^{-1})^n = (V^n)^{-1}$, and H is
closed under inversion. Thus H is a subgroup. Since V is a neighborhood of 1,
Vx is a neighborhood of x. Therefore $V^{n+1}$ is a neighborhood of each member
of $V^n$, and H is open. This proves (a).

Let H be as in (a). The subspace H of G is σ-compact and hence Lindelöf, and
Tychonoff’s Lemma² shows that it is normal as a topological subspace. Let $\{x_\alpha\}$
be a complete system of coset representatives for H in G, so that $G = \bigcup_\alpha x_\alpha H$ is
exhibited as the disjoint union of open closed sets, each of which is topologically
normal. If E and F are disjoint closed sets in G, then $E \cap x_\alpha H$ and $F \cap x_\alpha H$
are disjoint closed sets in $x_\alpha H$. Hence there exist disjoint open sets $U_\alpha$ and $V_\alpha$
in $x_\alpha H$ such that $E \cap x_\alpha H \subseteq U_\alpha$ and $F \cap x_\alpha H \subseteq V_\alpha$. Then $U = \bigcup_\alpha U_\alpha$ and
$V = \bigcup_\alpha V_\alpha$ are disjoint open sets in G such that $E \subseteq U$ and $F \subseteq V$. This proves
(b).


¹This fact provides justification for using the term “unitary” in Proposition 2.6 even when F = R.
²Proposition 10.9 of Basic.


The final proposition of the section shows that members of $C_{\mathrm{com}}(G)$ are
uniformly continuous in a certain sense that can be defined without the aid of 
a metric. 


Proposition 6.6. If G is a locally compact group and f is in $C_{\mathrm{com}}(G)$, then
for any $\varepsilon > 0$, there is an open neighborhood W of the identity with $W = W^{-1}$
such that $xy^{-1} \in W$ implies $|f(x) - f(y)| < \varepsilon$.


PROOF. Let S be the support of f, and let $\varepsilon > 0$ be given. For each y in S, let
$U_y$ be an open neighborhood of y such that $x \in U_y$ implies $|f(x) - f(y)| < \varepsilon/2$.
Since $U_y y^{-1}$ is a neighborhood of 1, we can find an open neighborhood $V_y$ of 1
with $V_y = V_y^{-1}$ and $V_y V_y \subseteq U_y y^{-1}$. As y varies through S, the sets $V_y y$
form an open cover of S. Let $\{V_{y_1} y_1, \dots, V_{y_n} y_n\}$ be a finite subcover, and put
$W = V_{y_1} \cap \cdots \cap V_{y_n}$. This will be the required neighborhood of 1.

To see that W has the property asserted, let $xy^{-1}$ be in W. If f(x) = f(y) = 0,
then $|f(x) - f(y)| < \varepsilon$. If $f(y) \ne 0$, then for some k, y is in
$V_{y_k} y_k \subseteq U_{y_k} y_k^{-1} y_k = U_{y_k}$, and thus $|f(y_k) - f(y)| < \varepsilon/2$. Also, $x = (xy^{-1})y$ is in
$W V_{y_k} y_k \subseteq V_{y_k} V_{y_k} y_k \subseteq U_{y_k} y_k^{-1} y_k = U_{y_k}$, and thus $|f(x) - f(y_k)| < \varepsilon/2$. Hence
$|f(x) - f(y)| < \varepsilon$. Finally if $f(x) \ne 0$, then $W = W^{-1}$ implies that $yx^{-1}$ is in
W, the roles of x and y are interchanged, and the proof that $|f(x) - f(y)| < \varepsilon$
goes through as above.


Corollary 6.7. If G is a locally compact group and f is in $C_{\mathrm{com}}(G)$, then the
map of $G \times G$ into C(G) given by $(g, g') \mapsto f(g(\cdot)g')$ is continuous.


PROOF. We first prove two special cases. If $g_0 \in G$ and $\varepsilon > 0$ are given,
then Proposition 6.6 produces an open neighborhood W of the identity such
that $\sup_{x \in G} |f(gx) - f(g_0 x)| < \varepsilon$ for $g g_0^{-1}$ in W, and hence $g \mapsto f(g(\cdot))$ is
continuous. Applying this result to the function $\tilde{f}$ given by $\tilde{f}(x) = f(x^{-1})$
and using continuity of the inversion map $x \mapsto x^{-1}$ within G, we see that
$g \mapsto f((\cdot)g)$ is continuous.

Now we reduce the general case to these two special cases. If $(g_0, g_0')$ is given
in $G \times G$, then

\[
\begin{aligned}
|f(gxg') - f(g_0 x g_0')| &\le |f(gxg') - f(g_0 x g')| + |f(g_0 x g') - f(g_0 x g_0')| \\
&\le \sup_{y \in G} |f(gy) - f(g_0 y)| + \sup_{z \in G} |f(zg') - f(z g_0')|.
\end{aligned}
\]

The two special cases show that the right side tends to 0 as $(g, g')$ tends to $(g_0, g_0')$,
and the corollary follows.
If G is a group and X is a set, a group action of G on X is a function
$G \times X \to X$, often written $(g, x) \mapsto gx$, such that


(i) 1x = x for all x in X,
(ii) $g_1(g_2 x) = (g_1 g_2)x$ for all x in X and all $g_1$ and $g_2$ in G.


If G is a topological group and X has a Hausdorff topology, a continuous group
action is a group action such that the map $(g, x) \mapsto gx$ is continuous. In this case
we say that G acts continuously on X. The fundamental example is the action of
G on the quotient space G/H by a closed subgroup: $(g, g'H) \mapsto (gg')H$.

An orbit for a group action of G on X is any subset Gx of X. The action 
is transitive if there is just one orbit, i.e., if Gx = X for some, or equivalently 
every, x in X. This is the situation with the fundamental example above. The 
action of the general linear group GL(N, R) on $\mathbb{R}^N$ by matrix multiplication is a
continuous group action that is not transitive; it has two orbits, one open and the 
other closed. 

Let G act continuously on X, fix $x_0$ in X, and let H be the subgroup of elements
h in G with $hx_0 = x_0$. This is the isotropy subgroup at $x_0$. It is a closed subgroup,
being the inverse image in G of the closed set $\{x_0\}$ under the continuous function
$g \mapsto gx_0$. Proposition 6.2 shows that the quotient topology on the set G/H of left
cosets is Hausdorff. Since G/H has the quotient topology, the continuous map
$G \to Gx_0$ given by $g \mapsto gx_0$ descends to a one-one continuous map $G/H \to Gx_0$.
In favorable cases the map $G/H \to Gx_0$ is a homeomorphism with its
image, and Problems 2-4 at the end of the chapter give sufficient conditions for 
it to be a homeomorphism. Sometimes the ability to do serious analysis on X 
depends on having the map be a homeomorphism. A case in which it is not a 
homeomorphism is the action of the discrete additive line G on the ordinary line 
X = R by translation. 
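
The orbit structure and the one-one map $G/H \to Gx_0$ described above can be seen concretely in a finite analog. The following sketch is an illustration only: it replaces R by the field Z/3Z (an assumption made so that everything is finite and the topology plays no role), and all function names are hypothetical. It lists the orbits of the invertible 2-by-2 matrices acting on column vectors, computes the isotropy subgroup H at $x_0 = (1, 0)$, and records the point $gx_0$ attached to each left coset gH.

```python
from itertools import product

P = 3  # work over the field Z/3Z; a finite stand-in for R (assumption)

def mat_vec(g, v):
    """Apply the 2x2 matrix g = ((a, b), (c, d)) to the column vector v, mod P."""
    (a, b), (c, d) = g
    return ((a * v[0] + b * v[1]) % P, (c * v[0] + d * v[1]) % P)

def mat_mul(g, h):
    """Matrix product gh, mod P."""
    (a, b), (c, d) = g
    (e, f), (p, q) = h
    return (((a * e + b * p) % P, (a * f + b * q) % P),
            ((c * e + d * p) % P, (c * f + d * q) % P))

# All invertible 2x2 matrices over Z/P: the finite analog of GL(2, R).
G = [((a, b), (c, d))
     for a, b, c, d in product(range(P), repeat=4)
     if (a * d - b * c) % P != 0]

X = list(product(range(P), repeat=2))  # the set being acted upon

# Orbits Gx, collected as a set of frozensets.
orbits = {frozenset(mat_vec(g, x) for g in G) for x in X}

# Isotropy subgroup at x0, the left cosets gH, and the point g x0 for each coset.
x0 = (1, 0)
H = [h for h in G if mat_vec(h, x0) == x0]
cosets = {frozenset(mat_mul(g, h) for h in H) for g in G}
images = {coset: {mat_vec(g, x0) for g in coset} for coset in cosets}
```

In this finite setting there are two orbits, the zero vector and everything else, mirroring the two orbits of GL(N, R) on $\mathbb{R}^N$. Each coset is carried to a single point of the orbit of $x_0$, and distinct cosets go to distinct points, which is the set-theoretic content of the map $G/H \to Gx_0$.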


2. Existence and Uniqueness of Haar Measure 


The point of view in Basic in approaching the Riesz Representation Theorem 
for a locally compact Hausdorff space X was that the steps in the construction 
of Lebesgue measure work equally well with X. The only thing that is missing 
is some device to encode geometric data—to provide a generalization of length. 
That missing ingredient is captured by any positive linear functional on $C_{\mathrm{com}}(X)$,
but there is no universal source of interesting such functionals. 

For the next few sections we shall impose additional structure on X, assuming
now that X is a locally compact group in the sense of Section 1. We shall see in 
this case that a nonzero positive linear functional always exists with the property 
that it takes equal values on a function and any left translate of the function. 
In other words the positive linear functional has the same kind of invariance 
property under translation as the Riemann integral. The corresponding regular
Borel measure, which is Lebesgue measure in the case of the line, is called a (left) 
“Haar measure” and is the main object of study in Sections 2-5 of this chapter. 

Several examples of locally compact groups were given in Section 1. Among 
them are the circle group, the additive group $\mathbb{R}^N$, and the general linear groups
GL(N, C) and GL(N, R), which consist of all N-by-N nonsingular matrices and 
have matrix multiplication as the group operation. Proposition 6.4 showed that 
any closed subgroup of a locally compact group is itself a locally compact group. 
Special linear groups, unitary groups, orthogonal groups, and rotation groups are 
among the examples that were mentioned. 

Thus let G be a locally compact group. We shall write the group multiplica- 
tively except when we are dealing with special examples where a different notation 
is more suitable. Ordinarily no special symbol will be used for a translation map 
in G. Thus left translations are simply the homeomorphisms $x \mapsto gx$ for g in G,
and right translations are the maps $x \mapsto xg$.

Let us consider these as special cases of what any continuous mapping does. 
The notation will be clearer if we distinguish the domain from the image. Thus let 
$\Phi$ be a continuous mapping of a locally compact Hausdorff space X into a locally
compact Hausdorff space Y. The mapping $\Phi$ carries subsets of X to subsets of
Y by the rule $\Phi(E) = \{\Phi(x) \mid x \in E\}$.

If $\Phi$ is a homeomorphism, it preserves the topological character of sets. Thus
compact sets go to compact sets, $G_\delta$’s go to $G_\delta$’s, and so on. Consequently Borel
sets map to Borel sets, and Baire sets map to Baire sets.

By contrast a scalar-valued function f on Y pulls back to the scalar-valued
function $f^\Phi$ on X given by $f^\Phi(x) = f(\Phi(x))$, with continuity being preserved.
A Borel measure $\mu$ on X pushes forward to a measure $\mu_\Phi$ on Y given by
$\mu_\Phi(E) = \mu(\Phi^{-1}(E))$; the measure $\mu_\Phi$ is defined on Borel sets but need not be
finite on compact sets. If $\Phi$ is a homeomorphism, however, then $\mu_\Phi$ is a Borel
measure, and regularity of $\mu$ implies regularity of $\mu_\Phi$.

Of great importance for current purposes is the effect of $\Phi$ on integration,
where the effect is that of a change of variables. The formula is

\[
\boxed{\int_X f^\Phi\, d\mu = \int_Y f\, d\mu_\Phi}
\]

if f is a Borel function $\ge 0$, for example. To prove this formula, we first
take f to be the indicator function $I_E$ of a subset E of Y. On the left side we
have $I_E^\Phi(x) = I_E(\Phi(x)) = I_{\Phi^{-1}(E)}(x)$. Hence the left side equals
$\int_X I_{\Phi^{-1}(E)}\, d\mu = \mu(\Phi^{-1}(E)) = \mu_\Phi(E)$, which in turn equals the right side $\int_Y I_E\, d\mu_\Phi$. Linearity
allows us to extend this conclusion to nonnegative simple functions, and monotone
convergence allows us to pass to Borel functions $\ge 0$.
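
On a finite set, where every measure is a finite list of point masses, the change-of-variables identity $\int_X f^\Phi\, d\mu = \int_Y f\, d\mu_\Phi$ can be verified by direct summation. The sketch below is a toy illustration under that finiteness assumption; the names `pushforward`, `pullback`, and `integrate`, and the particular choices of $\mu$, $\Phi$, and f, are hypothetical.

```python
from fractions import Fraction

def pushforward(mu, phi):
    """Return mu_Phi, where mu_Phi(E) = mu(Phi^{-1}(E)).
    A measure is stored as a dict from points to their masses."""
    nu = {}
    for x, mass in mu.items():
        y = phi(x)
        nu[y] = nu.get(y, Fraction(0)) + mass
    return nu

def pullback(f, phi):
    """Return f^Phi, the function x -> f(Phi(x))."""
    return lambda x: f(phi(x))

def integrate(f, mu):
    """Integral of f against a finitely supported measure."""
    return sum(f(x) * mass for x, mass in mu.items())

mu = {x: Fraction(1, x + 1) for x in range(6)}  # a measure on X = {0, ..., 5}
phi = lambda x: x % 3                            # a map Phi from X to Y = {0, 1, 2}
f = lambda y: y * y + 1                          # a nonnegative function on Y

lhs = integrate(pullback(f, phi), mu)     # integral over X of f^Phi d(mu)
rhs = integrate(f, pushforward(mu, phi))  # integral over Y of f d(mu_Phi)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding. The map in this example is not one-one, which is a reminder that the identity itself needs no invertibility.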
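An important consequence of the boxed formula is the formula

\[
(F\, d\mu)_\Phi = F^{\Phi^{-1}}\, d\mu_\Phi .
\]

In fact, if we set $f = F^{\Phi^{-1}} I_E$ in the boxed formula, then we obtain
$\int_X F\, I_E^\Phi\, d\mu = \int_Y F^{\Phi^{-1}} I_E\, d\mu_\Phi$. Thus $\int_{\Phi^{-1}(E)} F\, d\mu = \int_E F^{\Phi^{-1}}\, d\mu_\Phi$ and

\[
(F\, d\mu)_\Phi(E) = (F\, d\mu)(\Phi^{-1}(E)) = \int_{\Phi^{-1}(E)} F\, d\mu = \int_E F^{\Phi^{-1}}\, d\mu_\Phi = (F^{\Phi^{-1}}\, d\mu_\Phi)(E).
\]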

The Euclidean change-of-variables formula³ is a special case of the boxed
formula, and the content of the theorem amounts to an explicit identification of $\mu_\Phi$.
Let $\varphi : U \to \varphi(U)$ be a diffeomorphism with $\det \varphi'(x)$ nowhere 0. If $y = \varphi(x)$,
then the formula gives $dy = |\det \varphi'(x)|\, dx$. Since $dy = d(\varphi(x)) = (dx)_{\varphi^{-1}}$, the
formula is saying that $(dx)_{\varphi^{-1}} = |\det \varphi'(x)|\, dx$. We recover the usual Euclidean
integration formula by applying the boxed formula with $\Phi = \varphi^{-1}$, $X = \varphi(U)$,
$Y = U$, $d\mu = dy$, and $d\mu_\Phi = |\det \varphi'(x)|\, dx$, and then by letting $f = F^\varphi$.
The result is $\int_{\varphi(U)} F(y)\, dy = \int_U F(\varphi(x))\, |\det \varphi'(x)|\, dx$, as it should be.

The rule for composition for points and sets is that $(\Psi \circ \Phi)(x) = \Psi(\Phi(x))$
and $(\Psi \circ \Phi)(E) = \Psi(\Phi(E))$. But for functions and measures the rules are
$f^{\Psi \circ \Phi} = (f^\Psi)^\Phi$ and $\mu_{\Psi \circ \Phi} = (\mu_\Phi)_\Psi$. In other words, when $\Phi$ is followed by
$\Psi$ in operating on points and sets, $\Phi$ is again followed by $\Psi$ in pushing forward
measures, but $\Psi$ is followed by $\Phi$ in pulling back functions. In the special
case that X = Y = G, this feature will mean that certain expressions that we 
might want to write as triple products do not automatically satisfy an expected 
associativity property without some adjustment to the notation. 

First consider left translation. On points, left translation $L_h$ by h sends x to
hx, and left translation by g sends this to g(hx) = (gh)x. The behavior on
sets is similar. On functions and measures we therefore have
$f^{L_{gh}} = f^{L_g L_h} = (f^{L_g})^{L_h}$ and $\mu_{L_{gh}} = \mu_{L_g L_h} = (\mu_{L_h})_{L_g}$. To obtain group actions on functions and
measures, we therefore define

\[
(gf)(x) = f^{L_g^{-1}}(x) = f(g^{-1}x) \quad\text{and}\quad (g\mu)(E) = \mu_{L_g}(E) = \mu(g^{-1}E)
\]

for g in G. With these definitions we have g(hf) = (gh)f and $g(h\mu) = (gh)\mu$,
consistently with the formulas for a group action.
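
The need for the inverse in $(gf)(x) = f(g^{-1}x)$ shows up only in nonabelian groups, so a small nonabelian example makes it visible. The sketch below uses the symmetric group on three letters as a stand-in for G (an assumption chosen for finiteness; the function names are hypothetical) and compares the definition from the text with the tempting alternative $x \mapsto f(gx)$.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3; g is stored as (g(0), g(1), g(2))

def mul(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in range(3))

def inv(g):
    """Inverse permutation."""
    out = [0] * 3
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def act(g, f):
    """(gf)(x) = f(g^{-1} x), with f stored as a dict on G."""
    g_inv = inv(g)
    return {x: f[mul(g_inv, x)] for x in G}

def naive(g, f):
    """The alternative x -> f(gx), which reverses the order of products."""
    return {x: f[mul(g, x)] for x in G}

f = {x: i for i, x in enumerate(G)}  # an injective test function on G

is_action = all(act(g, act(h, f)) == act(mul(g, h), f) for g in G for h in G)
naive_is_action = all(naive(g, naive(h, f)) == naive(mul(g, h), f)
                      for g in G for h in G)
naive_reverses = all(naive(g, naive(h, f)) == naive(mul(h, g), f)
                     for g in G for h in G)
```

With these definitions `is_action` holds, while the alternative satisfies the reversed rule recorded in `naive_reverses` and therefore fails the group-action identity whenever g and h do not commute. On a finite group a measure is determined by its point masses, so the same computation applies to $(g\mu)(E) = \mu(g^{-1}E)$.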

With right translation the effect on points is that right translation by h sends x 
to xh, and right translation by g sends this to (xh)g = x(hg). The behavior on 
sets is similar. We want the same kind of formula with functions and measures, 
and to get it we define 


\[
(fg)(x) = f(xg^{-1}) \quad\text{and}\quad (\mu g)(E) = \mu(Eg^{-1})
\]

for g in G. With these definitions we have (fh)g = f(hg) and $(\mu h)g = \mu(hg)$.
These are the formulas of what we might view as a “right group action.”


³Theorem 6.32 of Basic.

A nonzero regular Borel measure on G invariant under all left translations is 
called a left Haar measure on G. A right Haar measure on G is a nonzero 
regular Borel measure invariant under all right translations. The main theorem, 
whose proof will occupy much of the remainder of this section, is as follows. 


Theorem 6.8. If G is a locally compact group, then G has a left Haar measure, 
and it is unique up to a multiplicative constant. Similarly G has a right Haar 
measure, and it is unique up to a multiplicative constant. 


Before coming to the proof, we give some examples. Checking the invariance
in each case involves using the boxed formula above for some homeomorphism
$\Phi$. In Euclidean situations we can often evaluate $\mu_\Phi$ directly by the
change-of-variables formula for multiple integrals. In an abelian group the left and right
Haar measures are the same, and we speak simply of Haar measure; but this need
not be true in nonabelian groups, as one of the examples will illustrate.


EXAMPLES. 
(1) $G = \mathbb{R}^N$ under addition. Lebesgue measure is a Haar measure.


(2) G = GL(N, R). Problem 4 in Chapter VI of Basic showed that if $M_N$ is
the $N^2$-dimensional Euclidean space of all real N-by-N matrices and if dx refers
to its Lebesgue measure, then

\[
\int_{M_N} f(gx)\, \frac{dx}{|\det x|^N} = \int_{M_N} f(x)\, \frac{dx}{|\det x|^N}
\]

for each nonsingular matrix g and Borel function $f \ge 0$. In the formula, gx is
the matrix product of g and x. Problem 10 in the same chapter showed that the
zero locus of any polynomial that is not identically zero has Lebesgue measure 0.
Thus the set where det x = 0 has measure 0, and we can rewrite the above formula
as

\[
\int_{GL(N,\mathbb{R})} f(gx)\, \frac{dx}{|\det x|^N} = \int_{GL(N,\mathbb{R})} f(x)\, \frac{dx}{|\det x|^N},
\]

where dx is still Lebesgue measure on the underlying Euclidean space of all
N-by-N matrices. This formula says that $dx/|\det x|^N$ is a left Haar measure on
GL(N, R). This measure happens to be also a right Haar measure.


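(3) $G = \left\{ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \right\}$ with real entries and a > 0. Then $a^{-2}\, da\, db$ is a left Haar
measure and $a^{-1}\, da\, db$ is a right Haar measure. To check the first of these
assertions, let $\varphi$ be left translation by $\begin{pmatrix} a_0 & b_0 \\ 0 & 1 \end{pmatrix}$. Since
$\begin{pmatrix} a_0 & b_0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a_0 a & a_0 b + b_0 \\ 0 & 1 \end{pmatrix}$,
we can regard $\varphi$ as the vector function $\varphi \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a_0 a \\ a_0 b + b_0 \end{pmatrix}$ with $\varphi' \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a_0 & 0 \\ 0 & a_0 \end{pmatrix}$
and $\left| \det \varphi' \begin{pmatrix} a \\ b \end{pmatrix} \right| = a_0^2$. Then $(da\, db)_{\varphi^{-1}} = a_0^2\, da\, db$ and
$(a^{-2}\, da\, db)_{\varphi^{-1}} = (a^{-2})^{\varphi}\, (da\, db)_{\varphi^{-1}} = (a_0 a)^{-2} a_0^2\, da\, db = a^{-2}\, da\, db$.
So $a^{-2}\, da\, db$ is indeed a left Haar measure. By a similar argument, $a^{-1}\, da\, db$ is a right Haar measure.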
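
The invariance claims in Example 3 can also be probed numerically. The sketch below is a rough check rather than a proof: it identifies a group element with the pair (a, b), uses a midpoint rule after the substitution $a = e^s$, and integrates an arbitrarily chosen Gaussian-type test function. The function names, the test function, the truncation bounds, and the step size are all assumptions made for illustration.

```python
import math

def integral(f, density, s_max=8.0, b_max=8.0, h=0.05):
    """Midpoint-rule approximation of the integral of f(a, b) * density(a) da db
    over a > 0 and b real, after the substitution a = exp(s), da = a ds."""
    ns = round(2 * s_max / h)
    nb = round(2 * b_max / h)
    total = 0.0
    for i in range(ns):
        a = math.exp(-s_max + (i + 0.5) * h)
        row = sum(f(a, -b_max + (j + 0.5) * h) for j in range(nb))
        total += density(a) * a * row
    return total * h * h

def f(a, b):
    """A rapidly decaying test function on the group (an arbitrary choice)."""
    return math.exp(-math.log(a) ** 2 - b * b)

def left_translate(f, a0, b0):
    """Return x -> f(g0 x), using (a0, b0)(a, b) = (a0 a, a0 b + b0)."""
    return lambda a, b: f(a0 * a, a0 * b + b0)

a0, b0 = 2.0, 0.5
shifted = left_translate(f, a0, b0)

left_before = integral(f, lambda a: a ** -2)        # against a^-2 da db
left_after = integral(shifted, lambda a: a ** -2)
right_before = integral(f, lambda a: a ** -1)       # against a^-1 da db
right_after = integral(shifted, lambda a: a ** -1)
```

For this test function the integrals can be computed in closed form: both integrals against $a^{-2}\, da\, db$ equal $\pi e^{1/4}$, whereas against $a^{-1}\, da\, db$ the left translate has integral $\pi / a_0$ instead of $\pi$. The second measure is therefore not left invariant, which illustrates that left and right Haar measures differ on this group.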


We shall begin the proof of Theorem 6.8 with uniqueness. The argument will
use Fubini’s Theorem for certain Borel measures on G, and we need to make two
adjustments to make Fubini’s Theorem apply. One is to work with Baire sets,
rather than Borel sets, so that the product σ-algebra from the Baire sets of G
times the Baire sets of G is the σ-algebra of Baire sets for $G \times G$.⁴ The other is
to arrange that the spaces we work with are σ-compact. The device for achieving
the σ-compactness is Proposition 6.5, which shows that G always has an open
σ-compact subgroup H. Imagine that we understand the restriction of a left Haar
measure $\mu$ to H. We form the left cosets gH, all of which are open in G. Any
compact set is covered by all these cosets, and there is a finite subcover. That
means that any compact set K is contained in the union of finitely many cosets
gH, say in $g_1 H \cup \cdots \cup g_n H$. We can compute $\mu$ on any gH by translating the
set by $g^{-1}$. This fact and the formula $\mu(K) = \sum_{j=1}^{n} \mu(K \cap g_j H)$ together show
that we can compute $\mu(K)$ from a knowledge of $\mu$ on H. Thus there is no loss
of generality in the uniqueness question in assuming that G is σ-compact.


PROOF OF UNIQUENESS IN THEOREM 6.8. As remarked above, $G$ has a
$\sigma$-compact open subgroup $H$, and it is enough to prove the uniqueness for $H$.
Changing notation, we may assume that our given group is $\sigma$-compact. We work
with Baire sets in this argument.

Let $\mu_1$ and $\mu_2$ be left Haar measures. Then the sum $\mu = \mu_1 + \mu_2$ is a left Haar
measure, and $\mu(E) = 0$ implies $\mu_1(E) = 0$. By the Radon–Nikodym Theorem,⁵
there exists a Baire function $h_1 \ge 0$ such that $\mu_1 = h_1\,d\mu$. Fix $g$ in $G$. By the
left invariance of $\mu_1$ and $\mu$, we have

$$\begin{aligned}
\int_G f(x)h_1(g^{-1}x)\,d\mu(x) &= \int_G f(gx)h_1(x)\,d\mu(x) = \int_G f(gx)\,d\mu_1(x) \\
&= \int_G f(x)\,d\mu_1(x) = \int_G f(x)h_1(x)\,d\mu(x)
\end{aligned}$$

for every Baire function $f \ge 0$. Therefore the measures $h_1(g^{-1}x)\,d\mu(x)$ and
$h_1(x)\,d\mu(x)$ are equal, and $h_1(g^{-1}x) = h_1(x)$ for almost every $x \in G$ (with
respect to $d\mu$). We can regard $h_1(g^{-1}x)$ and $h_1(x)$ as functions of $(g,x) \in G \times G$,
and these are Baire functions since the group operations are continuous. For each
$g$, they are equal for almost every $x$. By Fubini's Theorem they are equal for
almost every pair $(g,x)$ (with respect to the product measure), and then for almost
every $x$ they are equal for almost every $g$. Pick one such $x$, say $x_0$. Then it follows
that $h_1(x) = h_1(x_0)$ for almost every $x$. Thus $d\mu_1 = h_1(x_0)\,d\mu$. So $d\mu_1$ is a
multiple of $d\mu$, and so is $d\mu_2$.

⁴ Proposition 11.17 of Basic.
⁵ Theorem 9.16 of Basic.


Now we turn our attention to existence. The shortest and best-motivated known 
proof dates from 1940 and modifies Haar’s original argument in two ways that we 
shall mention. First let us consider that original argument, in which the setting is 
a locally compact separable metric topological group. In trying to construct an 
invariant measure, there is not much to work with, the situation being so general. 
We can get an idea how to proceed by examining $\mathbb{R}^N$, where we are trying to
construct Lebesgue measure out of almost nothing. We do have some rough 
comparisons of size because of the compactness. If we take a compact geometric 
rectangle and an open geometric rectangle, the latter centered at the origin, the 
compactness ensures that finitely many translates of the open rectangle together 
cover the compact rectangle. The smallest such number of translates is a rough 
estimate of the ratio of their Lebesgue measures. This integer estimate in some 
sense becomes more refined as the open rectangle gets smaller, but the integer in 
question grows in size also. To take this scaling factor into account, we compare 
this integer ratio with the integer ratio for some standard compact rectangle as 
the open rectangle gets small. This ratio of two integer ratios appears to be a 
good approximation to the ratio of the measure of the general compact rectangle 
to the measure of the standard compact rectangle. In fact, one easily shows that 
this ratio of ratios is bounded above and below as the open rectangle shrinks 
in size through a sequence of rectangles to a point. The Bolzano—Weierstrass 
Theorem gives a convergent subsequence for the ratio of ratios. It turns out that 
this convergence has to be addressed only for countably many of the compact 
rectangles, and this we can do by the Cantor diagonal process. Then we obtain 
a value for the measure of each compact rectangle in the countable set and, as 
a result, for all compact rectangles. It then has to be shown that we can build a 
measure out of this definition of the measure on compact rectangles. 
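The ratio-of-ratios heuristic can be seen concretely in dimension one. The sketch below is an illustration under simplifying assumptions that are not in the text: the compact "rectangle" is $[0, L]$, the standard one is $[0, 1]$, and the open rectangle is an interval of length `eps`. It uses the elementary fact that $n$ open intervals of length `eps` can cover $[0, L]$ exactly when $n \cdot \mathrm{eps} > L$. The function names are hypothetical.

```python
from fractions import Fraction

def covering_number(length, eps):
    """Least number of open intervals of length eps needed to cover [0, length].

    n translates suffice exactly when n * eps > length, so the answer is
    floor(length / eps) + 1.
    """
    return int(Fraction(length) / Fraction(eps)) + 1

def ratio_of_ratios(length, eps, standard=1):
    """Covering number of [0, length] divided by that of [0, standard]."""
    return Fraction(covering_number(length, eps), covering_number(standard, eps))

L = Fraction(5, 2)
for k in (1, 2, 3, 4):
    eps = Fraction(1, 10 ** k)
    # Each covering number grows without bound, but the quotient settles near L.
    print(eps, covering_number(L, eps), float(ratio_of_ratios(L, eps)))
```

Each individual covering number blows up as `eps` shrinks, while the normalized quotient stays bounded and approaches the ratio of lengths, which is the behavior the normalization in the argument is designed to capture.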

Two things are done to modify the above argument to obtain a general proof 
for locally compact groups. One is to replace the Cantor diagonal process by an 
application of the Tychonoff Product Theorem. The other is to bypass the long 
process of constructing a measure on Borel sets from its values on compact sets 
by instead using positive linear functionals and applying the Riesz Representation 
Theorem. Once an initial comparison can be made with continuous functions of 
compact support, rather than compact sets and open sets, the path to the theorem 
is fairly clear. It is Lemma 6.9 below that says that the initial comparison can be
carried out with such functions. For a locally compact group $G$, let $C_{\mathrm{com}}^{+}(G)$ be
the set of nonnegative elements in $C_{\mathrm{com}}(G)$.


Lemma 6.9. If $f$ and $\varphi$ are nonzero members of $C_{\mathrm{com}}^{+}(G)$, then there exist
a positive integer $n$, finitely many members $g_1, \dots, g_n$ of $G$, and real numbers
$c_1, \dots, c_n$, all $> 0$, such that

$$f(x) \le \sum_{j=1}^{n} c_j\,\varphi(g_jx) \qquad \text{for all } x.$$

REMARK. We let $H(f,\varphi)$ be the infimum of all finite sums $\sum_j c_j$ as in the
statement of the lemma. The expression $H(f,\varphi)$ is called the value of the Haar
covering function at $f$ and $\varphi$.

PROOF. Fix $c > \|f\|_{\sup}/\|\varphi\|_{\sup}$. The set $U = \{x \mid c\varphi(x) > \|f\|_{\sup}\}$ is open
and nonempty, and the sets $hU$, for $h \in G$, form an open cover of the support
of $f$. Choose a finite subcover, writing

$$\mathrm{support}(f) \subseteq h_1U \cup \cdots \cup h_nU.$$

For $1 \le j \le n$, we then have

$$h_jU = \{x \mid h_j^{-1}x \in U\} = \{x \mid c\varphi(h_j^{-1}x) > \|f\|_{\sup}\} \subseteq \Big\{x \;\Big|\; f(x) \le \sum_{i=1}^{n} c\,\varphi(h_i^{-1}x)\Big\}.$$

Hence

$$\mathrm{support}(f) \subseteq \Big\{x \;\Big|\; f(x) \le \sum_{j=1}^{n} c\,\varphi(h_j^{-1}x)\Big\}.$$

The lemma follows with $g_j = h_j^{-1}$ and with all $c_j$ equal to $c$.
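The proof of Lemma 6.9 is constructive once a finite subcover is chosen. As a hedged illustration, the sketch below carries it out on the finite cyclic group $\mathbb{Z}/n$ written additively, so that $\varphi(h^{-1}x)$ becomes `phi[(x - h) % n]`. The group, the greedy choice of subcover, the factor `1.01`, and the function names are all assumptions made for this example only. The quantity $c \cdot (\text{number of translates})$ is then an upper bound for $H(f,\varphi)$, not its exact value.

```python
def haar_cover(f, phi):
    """Follow the proof of Lemma 6.9 on Z/n.

    Returns (c, shifts) with f[x] <= sum(c * phi[(x - h) % n] for h in shifts).
    """
    n = len(f)
    f_sup, phi_sup = max(f), max(phi)
    c = 1.01 * f_sup / phi_sup                  # any c > ||f||_sup / ||phi||_sup works
    U = [x for x in range(n) if c * phi[x] > f_sup]
    uncovered = {x for x in range(n) if f[x] > 0}
    shifts = []
    while uncovered:
        # Greedy subcover: pick the translate h + U meeting the most uncovered points.
        h = max(range(n), key=lambda t: len(uncovered & {(t + u) % n for u in U}))
        shifts.append(h)
        uncovered -= {(h + u) % n for u in U}
    return c, shifts

def dominates(f, phi, c, shifts):
    """Check the inequality of the lemma at every point of Z/n."""
    n = len(f)
    return all(f[x] <= sum(c * phi[(x - h) % n] for h in shifts) for x in range(n))

f = [0, 1, 2, 3, 2, 1, 0, 0, 0, 0, 0, 0]
phi = [2, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
c, shifts = haar_cover(f, phi)
upper = c * len(shifts)          # upper bound for H(f, phi)
lower = max(f) / max(phi)        # lower bound ||f||_sup / ||phi||_sup
print(shifts, lower, upper, dominates(f, phi, c, shifts))
```

The true value $H(f,\varphi)$ lies between `lower` and `upper`; computing it exactly would require minimizing over all admissible systems of constants, which this sketch does not attempt.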


Lemma 6.10. The Haar covering function has the properties that

(a) $H(gf,\varphi) = H(f,\varphi)$ for $g$ in $G$,
(b) $H(f_1 + f_2,\varphi) \le H(f_1,\varphi) + H(f_2,\varphi)$,
(c) $H(cf,\varphi) = cH(f,\varphi)$ for $c > 0$,
(d) $f_1 \le f_2$ implies $H(f_1,\varphi) \le H(f_2,\varphi)$,
(e) $H(f,\psi) \le H(f,\varphi)H(\varphi,\psi)$,
(f) $H(f,\varphi) \ge \|f\|_{\sup}/\|\varphi\|_{\sup}$.

PROOF. Properties (a) through (d) are completely elementary. For (e), the
inequalities $f(x) \le \sum_i c_i\varphi(g_ix)$ and $\varphi(x) \le \sum_j d_j\psi(h_jx)$ together imply that
$f(x) \le \sum_{i,j} c_id_j\,\psi(h_jg_ix)$. Therefore

$$H(f,\psi) \le \inf \sum_{i,j} c_id_j = \Big(\inf \sum_i c_i\Big)\Big(\inf \sum_j d_j\Big) = H(f,\varphi)H(\varphi,\psi).$$

For (f), the fact that a continuous real-valued function on a compact set attains its
maximum value allows us to choose $y$ such that $f(y) = \|f\|_{\sup}$. Then

$$\|f\|_{\sup} = f(y) \le \sum_j c_j\varphi(g_jy) \le \sum_j c_j\|\varphi\|_{\sup},$$

and hence $\|f\|_{\sup}/\|\varphi\|_{\sup} \le \sum_j c_j$. Taking
the infimum over systems of constants $c_j$ gives $\|f\|_{\sup}/\|\varphi\|_{\sup} \le H(f,\varphi)$.

Following the outline above, we now perform the normalization. Fix a nonzero
member $f_0$ of $C_{\mathrm{com}}^{+}(G)$. If $\varphi$ and $f$ are nonzero members of $C_{\mathrm{com}}^{+}(G)$, define

$$\ell_\varphi(f) = H(f,\varphi)/H(f_0,\varphi).$$

After listing some elementary properties of $\ell_\varphi$, we shall prove in effect that $\ell_\varphi$ is
close to being additive if the support of $\varphi$ is small.


Lemma 6.11. $\ell_\varphi(f)$ has the properties that

(a) $0 < 1/H(f_0,f) \le \ell_\varphi(f) \le H(f,f_0)$,
(b) $\ell_\varphi(gf) = \ell_\varphi(f)$ for $g$ in $G$,
(c) $\ell_\varphi(f_1 + f_2) \le \ell_\varphi(f_1) + \ell_\varphi(f_2)$,
(d) $\ell_\varphi(cf) = c\,\ell_\varphi(f)$ if $c > 0$ is a constant.

PROOF. Properties (b), (c), and (d) are immediate from (a), (b), and (c) of
Lemma 6.10. For (a), we apply Lemma 6.10e with $\varphi$ there equal to $f_0$ and with
$\psi$ there equal to $\varphi$, and the resulting inequality is $H(f,\varphi) \le H(f,f_0)H(f_0,\varphi)$.
Thus $\ell_\varphi(f) \le H(f,f_0)$. Then we apply Lemma 6.10e with $f$ there equal
to $f_0$, $\varphi$ there equal to $f$, and $\psi$ there equal to $\varphi$. The resulting inequality is
$H(f_0,\varphi) \le H(f_0,f)H(f,\varphi)$. Thus $1/H(f_0,f) \le \ell_\varphi(f)$.


Lemma 6.12. If $f_1$ and $f_2$ are nonzero members of $C_{\mathrm{com}}^{+}(G)$ and if $\epsilon > 0$ is
given, then there exists an open neighborhood $V$ of the identity in $G$ such that

$$\ell_\varphi(f_1) + \ell_\varphi(f_2) \le \ell_\varphi(f_1 + f_2) + \epsilon$$

for every nonzero $\varphi$ in $C_{\mathrm{com}}^{+}(G)$ whose support is contained in $V$.


PROOF. Let $K$ be the support of $f_1 + f_2$, and let $F$ be a member of $C_{\mathrm{com}}(G)$
with values in $[0,1]$ such that $F$ is 1 on $K$. The number $\epsilon > 0$ is given in the
statement of the lemma, and we let $\delta$ be a positive number to be specified. Define
$f = f_1 + f_2 + \delta F$, $h_1 = f_1/f$, and $h_2 = f_2/f$, with the convention that $h_1$ and
$h_2$ are 0 on the set where $f$ is 0.

The functions $h_1$ and $h_2$ are continuous: In fact, there is no problem on the
open set where $f(x) \ne 0$. At a point $x$ where $f(x) = 0$, the functions $h_1$ and $h_2$
are continuous unless $x$ is a limit point of the set where $f_1 + f_2$ is not 0. This
set is contained in $K$, and thus $x$ must be in $K$. On the other hand, $F$ is 1 on $K$,
and hence $f$ is $\ge \delta$ on $K$. Hence there are no points $x$ where $h_1$ or $h_2$ fails to be
continuous.

Let $\eta > 0$ be another number to be specified. By Proposition 6.6 let $V$ be an
open neighborhood of the identity such that $V = V^{-1}$ and also

$$|h_1(x) - h_1(y)| < \eta \quad\text{and}\quad |h_2(x) - h_2(y)| < \eta$$

whenever $xy^{-1}$ is in $V$. If $\varphi \in C_{\mathrm{com}}^{+}(G)$ has support in $V$ and if positive constants
$c_j$ and group elements $g_j$ are chosen such that $f(x) \le \sum_j c_j\varphi(g_jx)$ for all $x$,
then every $x$ for which $\varphi(g_jx) > 0$ has the property that

$$|h_1(g_j^{-1}) - h_1(x)| < \eta \quad\text{and}\quad |h_2(g_j^{-1}) - h_2(x)| < \eta.$$

Hence

$$f_1(x) = f(x)h_1(x) \le \sum_j c_j\varphi(g_jx)h_1(x) \le \sum_j c_j\big(h_1(g_j^{-1}) + \eta\big)\varphi(g_jx).$$

Consequently

$$H(f_1,\varphi) \le \sum_j c_j\big(h_1(g_j^{-1}) + \eta\big).$$

Similarly

$$H(f_2,\varphi) \le \sum_j c_j\big(h_2(g_j^{-1}) + \eta\big).$$

Adding, we obtain

$$H(f_1,\varphi) + H(f_2,\varphi) \le \sum_j c_j\big(h_1(g_j^{-1}) + h_2(g_j^{-1}) + 2\eta\big) \le \sum_j c_j(1 + 2\eta)$$

since $h_1 + h_2 \le 1$. Taking the infimum over the $c_j$'s and the $g_j$'s gives

$$H(f_1,\varphi) + H(f_2,\varphi) \le H(f,\varphi)(1 + 2\eta).$$

Therefore

$$\begin{aligned}
\ell_\varphi(f_1) + \ell_\varphi(f_2) &\le \ell_\varphi(f)(1 + 2\eta) \\
&\le \big(\ell_\varphi(f_1 + f_2) + \delta\,\ell_\varphi(F)\big)(1 + 2\eta) \qquad \text{by (c) and (d) in Lemma 6.11} \\
&\le \ell_\varphi(f_1 + f_2) + \big(\delta H(F,f_0) + 2\delta\eta H(F,f_0) + 2\eta H(f_1 + f_2,f_0)\big),
\end{aligned}$$

the last inequality holding by Lemma 6.11a. This proves the inequality of the
lemma if $\delta$ and $\eta$ are chosen small enough that

$$\delta H(F,f_0) + 2\delta\eta H(F,f_0) + 2\eta H(f_1 + f_2,f_0) < \epsilon.$$


Lemma 6.13. There exists a nonzero positive linear functional $\ell$ on $C_{\mathrm{com}}(G)$
such that $\ell(f) = \ell(gf)$ for all $g \in G$ and $f \in C_{\mathrm{com}}(G)$.

PROOF. For each nonzero $f$ in $C_{\mathrm{com}}^{+}(G)$, let $S_f$ be the closed interval
$[1/H(f_0,f),\, H(f,f_0)]$. Let $S$ be the compact Hausdorff space

$$S = \prod_{\substack{f \in C_{\mathrm{com}}^{+}(G), \\ f \ne 0}} S_f.$$

A member of $S$ is a function that assigns to each nonzero member $f$ of $C_{\mathrm{com}}^{+}(G)$
a real number in the closed interval $S_f$, and $\ell_\varphi$ is such a function, according
to Lemma 6.11a. For each open neighborhood $V$ of the identity in $G$, define

$$E_V = \{\ell_\varphi \mid \varphi \in C_{\mathrm{com}}^{+}(G),\ \varphi \ne 0,\ \mathrm{support}(\varphi) \subseteq V\}$$

as a nonempty subset of $S$. If $V \subseteq V'$, then $E_V \subseteq E_{V'}$ and hence also $E_V^{\mathrm{cl}} \subseteq E_{V'}^{\mathrm{cl}}$.
Thus if $V_1, \dots, V_n$ are open neighborhoods of the identity, then

$$E_{V_1 \cap \cdots \cap V_n}^{\mathrm{cl}} \subseteq E_{V_1}^{\mathrm{cl}} \cap \cdots \cap E_{V_n}^{\mathrm{cl}}.$$

Consequently the closed sets $E_V^{\mathrm{cl}}$ have the finite-intersection property. Since $S$ is
compact, they have nonempty intersection. Let $\ell$ be a point of $S$ lying in their
intersection. For $\ell$ to be in $E_V^{\mathrm{cl}}$ for a particular $V$ means that for each $\epsilon > 0$ and
each finite set $f_1, \dots, f_n$ of nonzero members of $C_{\mathrm{com}}^{+}(G)$, there is a nonzero $\varphi$
in $C_{\mathrm{com}}^{+}(G)$ with support in $V$ such that

$$|\ell(f_j) - \ell_\varphi(f_j)| < \epsilon \qquad \text{for } 1 \le j \le n. \tag{$*$}$$

On the nonzero functions in $C_{\mathrm{com}}^{+}(G)$, let us observe the following facts:

(i) $\ell(f) \ge 0$ and $\ell(f_0) = 1$, the latter because $\ell_\varphi(f_0) = 1$ for all $\varphi$.

(ii) $\ell(f) = \ell(gf)$ for $g \in G$, since for any $\epsilon > 0$, $|\ell(f) - \ell(gf)| \le
|\ell(f) - \ell_\varphi(f)| + |\ell_\varphi(f) - \ell_\varphi(gf)| + |\ell_\varphi(gf) - \ell(gf)| < 2\epsilon$ by Lemma
6.11b if $V$ and $\varphi$ are as in $(*)$ for the two functions $f$ and $gf$.

(iii) $\ell(f_1 + f_2) = \ell(f_1) + \ell(f_2)$ because if $\epsilon > 0$ is given, if $V$ is chosen for
this $\epsilon$ according to Lemma 6.12, and if $\varphi$ is chosen for $f_1$, $f_2$, and $f_1 + f_2$ as in
$(*)$, then we have $\ell(f_1 + f_2) \le \ell_\varphi(f_1 + f_2) + \epsilon \le \ell_\varphi(f_1) + \ell_\varphi(f_2) + \epsilon
\le \ell(f_1) + \ell(f_2) + 3\epsilon$ and $\ell(f_1) + \ell(f_2) \le \ell_\varphi(f_1) + \ell_\varphi(f_2) + 2\epsilon \le
\ell_\varphi(f_1 + f_2) + 3\epsilon \le \ell(f_1 + f_2) + 4\epsilon$, the next-to-last inequality holding
by Lemma 6.12.

(iv) $\ell(cf) = c\,\ell(f)$ for $c > 0$ because if $V$ and $\varphi$ are as in $(*)$ for $\epsilon > 0$
and the two functions $f$ and $cf$, then we have $\ell(cf) \le \ell_\varphi(cf) + \epsilon =
c\,\ell_\varphi(f) + \epsilon \le c\,\ell(f) + (c+1)\epsilon$ and $c\,\ell(f) \le c\,\ell_\varphi(f) + c\epsilon = \ell_\varphi(cf) + c\epsilon \le
\ell(cf) + (c+1)\epsilon$.

Because of (iii) and (iv), $\ell$ extends to a linear functional on $C_{\mathrm{com}}(G)$, and this linear
functional is positive by (i) and satisfies the invariance condition $\ell(f) = \ell(gf)$
by (ii).


230 VI. Compact and Locally Compact Groups 


PROOF OF EXISTENCE IN THEOREM 6.8. Fix a nonzero function $f_0$ in $C_{\mathrm{com}}^{+}(G)$,
and let $\mu$ be the measure given by the Riesz Representation Theorem as corresponding
to the positive linear functional $\ell$ in Lemma 6.13. If $K_0$ is a nonempty
compact $G_\delta$ and if $\{f_n\}$ is a decreasing sequence in $C_{\mathrm{com}}(G)$ with pointwise limit
$I_{K_0}$, then we have $\int_G gf_n\,d\mu = \int_G f_n\,d\mu$ for all $g \in G$ and all $n$. Passing to
the limit and applying dominated convergence gives $\int_G gI_{K_0}\,d\mu = \int_G I_{K_0}\,d\mu$.
Now $gI_{K_0}(x) = I_{K_0}(g^{-1}x) = I_{gK_0}(x)$, and hence $\mu(gK_0) = \mu(K_0)$ for all $g$.
In other words, the regular Borel measures $g^{-1}\mu$ and $\mu$ agree on compact $G_\delta$'s.
This equality is enough⁶ to force the equality $g^{-1}\mu = \mu$ for all $g$. Finally $\mu$ is
not the 0 measure since $\int_G f_0\,d\mu = 1$.


3. Modular Function 


We continue with $G$ as a locally compact group. From now on, we shall often
denote particular left and right Haar measures on $G$ by $d_lx$ and $d_rx$, respectively.
An important property of left and right Haar measures is that


any nonempty open set has nonzero Haar measure. 


In fact, in the case of a left Haar measure, if any compact set is given, finitely many 
left translates of the given open set together cover the compact set. If the open set 
had 0 measure, so would its left translates and so would every compact set. Then 
the measure would be identically 0 by regularity. A similar argument applies to 
any right Haar measure. We shall occasionally make use of this property without 
explicit mention. 

Actually, left Haar measure and right Haar measure have the same sets of 
measure 0, as will follow from Proposition 6.15c below. Thus we are completely 
justified in using the expression “nonzero Haar measure” above. 

Fix a left Haar measure $d_lx$. Since left translations on $G$ commute with right
translations, $d_l(\,\cdot\,g)$ is a left Haar measure for any $g \in G$. Left Haar measures
are proportional, and we therefore define the modular function $\Delta : G \to \mathbb{R}^{+}$ of
$G$ by

$$d_l(\,\cdot\,g) = \Delta(g^{-1})\,d_l(\,\cdot\,).$$


Lemma 6.14. For any regular Borel measure $\mu$ on $G$, any $g_0$ in $G$, and any $p$
with $1 \le p < \infty$, the limit relations

$$\lim_{g \to g_0} \int_G |f(gx) - f(g_0x)|^p\,d\mu(x) = 0 \quad\text{and}\quad \lim_{g \to g_0} \int_G |f(xg) - f(xg_0)|^p\,d\mu(x) = 0$$

hold for each $f$ in $C_{\mathrm{com}}(G)$. In particular,

$$g \mapsto \int_G f(gx)\,d\mu(x) \quad\text{and}\quad g \mapsto \int_G f(xg)\,d\mu(x)$$

are continuous scalar-valued functions for such $f$.

⁶ Propositions 11.19 and 11.18 of Basic.


PROOF. Corollary 6.7 shows that $g \mapsto f(g(\,\cdot\,))$ is continuous from $G$ into
$C(G)$. Let $\epsilon > 0$ be given, and choose a neighborhood $N$ of $g_0$ such that
$\sup_{x \in G} |f(gx) - f(g_0x)| \le \epsilon$ for $g$ in $N$. If $K$ is a compact neighborhood of $g_0$,
then the set of products $K\,\mathrm{support}(f)$ is compact, being the continuous image of a
compact subset of $G \times G$ under multiplication. It therefore has finite $\mu$ measure,
say $C$. When $g$ is in $K \cap N$, we have

$$\int_G |f(gx) - f(g_0x)|^p\,d\mu(x) \le \epsilon^p\,\mu(K\,\mathrm{support}(f)) = C\epsilon^p,$$

and the first limit relation follows. Taking $p = 1$, we have

$$\Big|\int_G f(gx)\,d\mu(x) - \int_G f(g_0x)\,d\mu(x)\Big| \le \int_G |f(gx) - f(g_0x)|\,d\mu(x),$$

and we have just seen that the right side tends to 0 as $g$ tends to $g_0$. This proves
the first conclusion about continuity of scalar-valued functions.

For the other limit relation and continuity result, we replace $f$ by the function
$\tilde{f}$ with $\tilde{f}(x) = f(x^{-1})$, and we apply to $\tilde{f}$ what has just been proved, taking into
account the continuity of the inversion mapping on $G$.


Proposition 6.15. The modular function $\Delta$ for $G$ has the properties that
(a) $\Delta : G \to \mathbb{R}^{+}$ is a continuous group homomorphism,
(b) $\Delta(g) = 1$ for $g$ in any compact subgroup of $G$,
(c) $d_l(x^{-1})$ and $\Delta(x)\,d_lx$ are right Haar measures and are equal,
(d) $d_r(x^{-1})$ and $\Delta(x)^{-1}\,d_rx$ are left Haar measures and are equal,
(e) $d_r(g\,\cdot\,) = \Delta(g)\,d_r(\,\cdot\,)$ for any right Haar measure on $G$.


PROOF. For (a), we take $d\mu(x) = d_lx$ in Lemma 6.14 and see that the function
$g \mapsto \int_G f(xg)\,d_lx = \int_G f(x)\,d_l(xg^{-1}) = \Delta(g)\int_G f(x)\,d_lx$ is continuous if $f$
is in $C_{\mathrm{com}}(G)$. Since there exist functions $f$ in $C_{\mathrm{com}}(G)$ with $\int_G f(x)\,d_lx \ne 0$,
$g \mapsto \Delta(g)$ is continuous. The homomorphism property follows from the fact that

$$\Delta(hg)\,d_lx = d_l(x(hg)^{-1}) = d_l((xg^{-1})h^{-1}) = \Delta(h)\,d_l(xg^{-1}) = \Delta(h)\Delta(g)\,d_lx.$$

For (b), the image under $\Delta$ of any compact subgroup of $G$ is a compact
subgroup of $\mathbb{R}^{+}$ and hence is $\{1\}$.

In (c), put $d\mu(x) = \Delta(x)\,d_lx$. This is a regular Borel measure since $\Delta$ is
continuous by (a). Since $\Delta$ is a homomorphism, we have

$$\begin{aligned}
\int_G f(xg)\,d\mu(x) &= \int_G f(xg)\Delta(x)\,d_lx = \int_G f(x)\Delta(xg^{-1})\,d_l(xg^{-1}) \\
&= \int_G f(x)\Delta(x)\Delta(g^{-1})\Delta(g)\,d_lx \\
&= \int_G f(x)\Delta(x)\,d_lx = \int_G f(x)\,d\mu(x).
\end{aligned}$$

Hence $d\mu(x)$ is a right Haar measure. Meanwhile, $d_l(x^{-1})$ is a right Haar measure
because

$$\begin{aligned}
\int_G f(xg)\,d_l(x^{-1}) &= \int_G f(x^{-1}g)\,d_lx = \int_G f((g^{-1}x)^{-1})\,d_lx \\
&= \int_G f(x^{-1})\,d_lx = \int_G f(x)\,d_l(x^{-1}).
\end{aligned}$$

Thus Theorem 6.8 for right Haar measures implies that $d_l(x^{-1}) = c\,\Delta(x)\,d_lx$ for
some constant $c > 0$. Changing $x$ to $x^{-1}$ in this formula, we obtain

$$d_lx = c\,\Delta(x^{-1})\,d_l(x^{-1}) = c^2\Delta(x^{-1})\Delta(x)\,d_lx = c^2\,d_lx.$$

Hence $c = 1$, and (c) is proved.

For (d) and (e) there is no loss of generality in assuming that $d_rx = d_l(x^{-1}) =
\Delta(x)\,d_lx$, in view of (c). Conclusion (d) is immediate from this identity if we
replace $x$ by $x^{-1}$. For (e) we have

$$\begin{aligned}
\int_G f(x)\,d_r(gx) &= \int_G f(g^{-1}x)\,d_rx = \int_G f(g^{-1}x)\Delta(x)\,d_lx = \int_G f(x)\Delta(gx)\,d_lx \\
&= \Delta(g)\int_G f(x)\Delta(x)\,d_lx = \Delta(g)\int_G f(x)\,d_rx,
\end{aligned}$$

and we conclude that $d_r(g\,\cdot\,) = \Delta(g)\,d_r(\,\cdot\,)$.
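For a concrete feel for Proposition 6.15, one can test it on the group of matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ with $a > 0$ from Example (3), where $a^{-2}\,da\,db$ is a left Haar measure and $a^{-1}\,da\,db$ is a right Haar measure. Part (c) then forces $\Delta(a,b)\,a^{-2}$ to be a constant multiple of $a^{-1}$, and $\Delta(1) = 1$ gives $\Delta(a,b) = a$. The sketch below checks this closed form against the definition and against parts (a), (c), and (e). The function names and the sample points are assumptions for illustration, and the Jacobian determinants of the two translations ($a_0$ for right translation, $a_0^2$ for left translation) are entered by hand.

```python
import math

def mul(g, h):
    """Product of matrices [[a, b], [0, 1]], stored as pairs (a, b)."""
    return (g[0] * h[0], g[0] * h[1] + g[1])

def inv(g):
    return (1.0 / g[0], -g[1] / g[0])

def modular(g):
    """Closed form Delta(a, b) = a derived in the lead-in."""
    return g[0]

def left_density(x):
    return x[0] ** -2

def right_density(x):
    return x[0] ** -1

def right_shift_of_left_measure(g, x):
    """Density of E -> m_l(E g); x -> x g has Jacobian determinant g[0]."""
    return left_density(mul(x, g)) * g[0]

def left_shift_of_right_measure(g, x):
    """Density of E -> m_r(g E); x -> g x has Jacobian determinant g[0] ** 2."""
    return right_density(mul(g, x)) * g[0] ** 2

g, h, x = (2.0, 1.0), (0.5, -3.0), (1.7, 0.4)
checks = {
    "homomorphism (a)": math.isclose(modular(mul(g, h)), modular(g) * modular(h)),
    "definition of Delta": math.isclose(right_shift_of_left_measure(g, x),
                                        modular(inv(g)) * left_density(x)),
    "part (c)": math.isclose(modular(x) * left_density(x), right_density(x)),
    "part (e)": math.isclose(left_shift_of_right_measure(g, x),
                             modular(g) * right_density(x)),
}
print(checks)
```

Since $\Delta$ is not identically 1 here, the sketch also exhibits a group that fails to be unimodular.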


The locally compact group G is said to be unimodular if every left Haar 
measure is a right Haar measure (and vice versa). In this case we can speak of 
Haar measure on G. 

In view of Proposition 6.15e, $G$ is unimodular if and only if $\Delta(t) = 1$ for all
$t \in G$. Locally compact abelian groups are of course unimodular. Proposition
6.15b shows that compact groups are unimodular.

Any commutator $ghg^{-1}h^{-1}$ in $G$ is carried to 1 by the modular function $\Delta$.
Consequently any group that is generated by commutators, such as $SL(N,\mathbb{R})$,
is unimodular. More generally any group that is generated by commutators,
elements of the center, and elements of finite order is unimodular; $GL(N,\mathbb{R})$ is
an example.


Theorem 6.16. Let $G$ be a separable locally compact group, and let $S$ and $T$
be closed subgroups such that $S \cap T$ is compact, multiplication $S \times T \to G$ is
an open map, and the set of products $ST$ exhausts $G$ except possibly for a set of
Haar measure 0. Let $\Delta_T$ and $\Delta_G$ denote the modular functions of $T$ and $G$. Then
the left Haar measures on $G$, $S$, and $T$ can be normalized so that

$$\int_G f(x)\,d_lx = \int_{S \times T} f(st)\,\frac{\Delta_T(t)}{\Delta_G(t)}\,d_ls\,d_lt$$

for all Borel functions $f \ge 0$ on $G$.

REMARK. The assumption of separability avoids all potential problems with
using Fubini's Theorem in the course of the proof. Problems 21–22 at the end of
the chapter give a condition under which multiplication $S \times T \to G$ is an open
map, and they provide examples.


PROOF. Let $\Omega \subseteq G$ be the set of products $ST$, and let $K = S \cap T$. The group
$S \times T$ acts continuously on $\Omega$ by $(s,t)\omega = s\omega t^{-1}$, and the isotropy subgroup at 1
is $\mathrm{diag}\,K$. Thus the map $(s,t) \mapsto st^{-1}$ descends to a map $(S \times T)/\mathrm{diag}\,K \to \Omega$.
This map is a homeomorphism since multiplication $S \times T \to G$ is assumed to
be an open map.

Hence any Borel measure on $\Omega$ can be reinterpreted as a Borel measure on
$(S \times T)/\mathrm{diag}\,K$. We apply this observation to the restriction of a left Haar measure
$d_lx$ for $G$ from $G$ to $\Omega$, obtaining a Borel measure $d\mu$ on $(S \times T)/\mathrm{diag}\,K$. On
$\Omega$, we have

$$d_l(s_0xt_0^{-1}) = \Delta_G(t_0)\,d_lx,$$

and the action unwinds to

$$d\mu((s_0,t_0)(s,t)(\mathrm{diag}\,K)) = \Delta_G(t_0)\,d\mu((s,t)(\mathrm{diag}\,K)) \tag{$*$}$$

on $(S \times T)/\mathrm{diag}\,K$. Using the Riesz Representation Theorem, define a measure
$d\tilde{\mu}(s,t)$ on $S \times T$ in terms of a positive linear functional on $C_{\mathrm{com}}(S \times T)$ by

$$\int_{S \times T} f(s,t)\,d\tilde{\mu}(s,t) = \int_{(S \times T)/\mathrm{diag}\,K} \Big[\int_K f(sk,tk)\,dk\Big]\,d\mu((s,t)(\mathrm{diag}\,K)),$$

where $dk$ is a Haar measure on $K$ normalized to have total mass 1. From $(*)$ it
follows that

$$d\tilde{\mu}(s_0s,t_0t) = \Delta_G(t_0)\,d\tilde{\mu}(s,t).$$

The same proof as for the uniqueness in Theorem 6.8 shows that any two Borel
measures on $S \times T$ with this property are proportional, and $\Delta_G(t)\,d_ls\,d_lt$ is such
a measure. Therefore

$$d\tilde{\mu}(s,t) = \Delta_G(t)\,d_ls\,d_lt$$

for a suitable normalization of $d_ls\,d_lt$.

The resulting formula is

$$\int_\Omega f(x)\,d_lx = \int_{S \times T} f(st^{-1})\,\Delta_G(t)\,d_ls\,d_lt$$

for all Borel functions $f \ge 0$ on $\Omega$. On the right side the change of variables
$t \mapsto t^{-1}$ makes the right side become

$$\int_{S \times T} f(st)\,\Delta_G(t)^{-1}\,d_ls\,\Delta_T(t)\,d_lt,$$

according to Proposition 6.15c, and we can replace $\Omega$ by $G$ on the left side since
the complement of $\Omega$ in $G$ has measure 0 by assumption. This completes the
proof.
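As a consistency check on Theorem 6.16, one can apply it to the group of Example (3). The choices of $S$ and $T$ below are illustrative assumptions, not taken from the text: $S$ is the subgroup with $a = 1$ and $T$ is the subgroup with $b = 0$, so that $S \cap T = \{1\}$, $ST = G$, and $(s,t) \mapsto st$ is a homeomorphism. The value $\Delta_G(t) = a$ is what Proposition 6.15c yields when $a^{-2}\,da\,db$ and $a^{-1}\,da\,db$ are left and right Haar measures, and $\Delta_T = 1$ because $T$ is abelian.

```latex
% Assumed setting: G = { (a b; 0 1) : a > 0 }, S = { a = 1 }, T = { b = 0 }.
\begin{align*}
s &= \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}, \qquad
t = \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}, \qquad
st = \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}, \\
d_l s &= db, \qquad d_l t = a^{-1}\,da, \qquad \Delta_T(t) = 1, \qquad \Delta_G(t) = a, \\
\int_{S \times T} f(st)\,\frac{\Delta_T(t)}{\Delta_G(t)}\,d_l s\,d_l t
  &= \int_0^\infty \int_{-\infty}^{\infty}
     f\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} a^{-1}\,db\;a^{-1}\,da
   = \int_G f(x)\,a^{-2}\,da\,db .
\end{align*}
```

The right side is the integral against the left Haar measure $a^{-2}\,da\,db$ found in Example (3), in agreement with the theorem.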


4. Invariant Measures on Quotient Spaces 


If $H$ is a closed subgroup of $G$, then we can ask whether $G/H$ has a nonzero $G$
invariant Borel measure. Theorem 6.18 below will give a necessary and sufficient
condition for this existence, but we need some preparation. Fix a left Haar measure
$d_lh$ for $H$. If $f$ is in $C_{\mathrm{com}}(G)$, define

$$f^{\#}(g) = \int_H f(gh)\,d_lh.$$

This function is invariant under right translation by $H$, and we can define

$$f^{\#\#}(gH) = f^{\#}(g).$$

The function $f^{\#\#}$ has compact support on $G/H$.


Lemma 6.17. The map $f \mapsto f^{\#\#}$ carries $C_{\mathrm{com}}(G)$ onto $C_{\mathrm{com}}(G/H)$, and a
nonnegative member of $C_{\mathrm{com}}(G/H)$ has a nonnegative preimage in $C_{\mathrm{com}}(G)$.


PROOF. Let $\pi : G \to G/H$ be the quotient map. Let $F \in C_{\mathrm{com}}(G/H)$ be
given, and let $K$ be a compact set in $G/H$ with $F = 0$ off $K$. We first produce
a compact set $\tilde{K}$ in $G$ with $\pi(\tilde{K}) = K$. For each coset in $K$, select an inverse
image $x$ and let $N_x$ be a compact neighborhood of $x$ in $G$. Since $\pi$ is open, $\pi$ of
the interior of $N_x$ is open. These open sets cover $K$, and a finite number of them
suffices. Then we can take $\tilde{K}$ to be the intersection of the closed set $\pi^{-1}(K)$ with
the compact union of the finitely many $N_x$'s.

Next let $K_H$ be a compact neighborhood of 1 in $H$. Since nonempty open
sets always have positive Haar measure, the left Haar measure on $H$ is positive
on $K_H$. Let $\tilde{K}'$ be the compact set $\tilde{K}' = \tilde{K}K_H$, so that $\pi(\tilde{K}') = \pi(\tilde{K}) = K$.
Choose $f_1 \in C_{\mathrm{com}}(G)$ with $f_1 \ge 0$ everywhere and with $f_1 = 1$ on $\tilde{K}'$. If $g$ is in
$\tilde{K}$, then $\int_H f_1(gh)\,d_lh$ is $\ge$ the $H$ measure of $K_H$, and hence $f_1^{\#\#}$ is $> 0$ on $K$.
Define

$$f(g) = \begin{cases} f_1(g)\,\dfrac{F(\pi(g))}{f_1^{\#\#}(\pi(g))} & \text{if } \pi(g) \in K, \\[1ex] 0 & \text{otherwise.} \end{cases}$$

Then $f^{\#\#}$ equals $F$ on $K$ and equals 0 off $K$, and therefore $f^{\#\#} = F$ everywhere.

Certainly $f$ has compact support. To see that $f$ is continuous, it suffices to
check that the two formulas for $f(g)$ fit together continuously at points $g$ of the
closed set $\pi^{-1}(K)$. It is enough to check points where $f(g) \ne 0$. Say $g_\alpha \to g$ for
a net $\{g_\alpha\}$. We must have $F(\pi(g)) \ne 0$. Since $F$ is continuous, $F(\pi(g_\alpha)) \ne 0$
eventually. Thus for all $\alpha$ sufficiently large, $f(g_\alpha)$ is given by the first of the two
formulas. Thus $f$ is continuous.


Theorem 6.18. Let $G$ be a locally compact group, let $H$ be a closed subgroup,
and let $\Delta_G$ and $\Delta_H$ be the respective modular functions. Then a necessary and
sufficient condition for $G/H$ to have a nonzero $G$ invariant regular Borel measure
is that the restriction to $H$ of $\Delta_G$ equal $\Delta_H$. In this case such a measure $d\mu(gH)$
is unique up to a scalar, and it can be normalized so that

$$\boxed{\int_G f(g)\,d_lg = \int_{G/H} \Big[\int_H f(gh)\,d_lh\Big]\,d\mu(gH)}$$

for all $f \in C_{\mathrm{com}}(G)$.


PROOF. Let $d\mu(gH)$ be a nonzero invariant regular Borel measure on $G/H$.
Using the function $f^{\#\#}$ defined above, we can define a measure $d\tilde{\mu}(g)$ on $G$ via
a linear functional on $C_{\mathrm{com}}(G)$ by

$$\int_G f(g)\,d\tilde{\mu}(g) = \int_{G/H} f^{\#\#}(gH)\,d\mu(gH).$$

Since $f \mapsto f^{\#\#}$ commutes with left translation by $G$, $d\tilde{\mu}$ is a left Haar measure
on $G$. By Theorem 6.8, $d\tilde{\mu}$ is unique up to a scalar; hence $d\mu(gH)$ is unique up
to a scalar.

Under the assumption that $G/H$ has a nonzero invariant Borel measure, we
have just seen in essence that we can normalize the measure so that the boxed
formula holds. If we replace $f$ in the boxed formula by $f(\,\cdot\,h_0)$, then the left
side is multiplied by $\Delta_G(h_0)$, and the right side is multiplied by $\Delta_H(h_0)$. Hence
$\Delta_G|_H = \Delta_H$ is necessary for existence.

Let us prove that this condition is sufficient for existence. If $h$ in $C_{\mathrm{com}}(G/H)$
is given, we can choose $f$ in $C_{\mathrm{com}}(G)$ by Lemma 6.17 such that $f^{\#\#} = h$. Then
we define $L(h) = \int_G f(g)\,d_lg$. If $L$ is well defined, then it is a linear functional,
Lemma 6.17 shows that it is positive, and $L$ certainly is the same on a function as
on its $G$ translates. By the Riesz Representation Theorem, $L$ defines a $G$ invariant
Borel measure $d\mu(gH)$ on $G/H$ such that the boxed formula holds.

Thus all we need to do is see that $L$ is well defined if $\Delta_G|_H = \Delta_H$. We are
thus to prove that if $f \in C_{\mathrm{com}}(G)$ has $f^{\#} = 0$, then $\int_G f(g)\,d_lg = 0$. Let $\psi$
be in $C_{\mathrm{com}}(G)$. Since Fubini's Theorem is applicable to continuous functions of
compact support, we have

$$\begin{aligned}
0 &= \int_G \psi(g)f^{\#}(g)\,d_lg \\
&= \int_G \Big[\int_H \psi(g)f(gh)\,d_lh\Big]\,d_lg \\
&= \int_H \Big[\int_G \psi(g)f(gh)\,d_lg\Big]\,d_lh \\
&= \int_H \Big[\int_G \psi(gh^{-1})f(g)\,d_lg\Big]\,\Delta_G(h)\,d_lh && \text{by definition of } \Delta_G \\
&= \int_G f(g)\Big[\int_H \psi(gh^{-1})\Delta_G(h)\,d_lh\Big]\,d_lg \\
&= \int_G f(g)\Big[\int_H \psi(gh)\Delta_G(h)^{-1}\Delta_H(h)\,d_lh\Big]\,d_lg && \text{by Proposition 6.15c} \\
&= \int_G f(g)\psi^{\#}(g)\,d_lg && \text{since } \Delta_G|_H = \Delta_H.
\end{aligned}$$

By Lemma 6.17 we can choose $\psi \in C_{\mathrm{com}}(G)$ such that $\psi^{\#\#} = 1$ on the image in
$G/H$ of the support of $f$. Then the right side of the above display is $\int_G f(g)\,d_lg$,
and the conclusion is that this is 0. Thus $L$ is well defined, and existence is
proved.


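EXAMPLE. Let $G = SL(2,\mathbb{R})$, and let $\mathcal{H}$ be the upper half plane in $\mathbb{C}$, namely
$\{z \mid \operatorname{Im} z > 0\}$. The group $G$ acts continuously on $\mathcal{H}$ by linear fractional
transformations, the action being

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}(z) = \frac{az + b}{cz + d}.$$

This action is transitive since

$$\begin{pmatrix} y^{1/2} & xy^{-1/2} \\ 0 & y^{-1/2} \end{pmatrix}(i) = x + iy \qquad \text{if } y > 0, \tag{$*$}$$

and the subgroup that leaves $i$ fixed, by direct computation, is the rotation
subgroup $K$, which consists of the matrices $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$. The mapping of
$G$ to $\mathcal{H}$ given by $g \mapsto g(i)$ therefore descends to a one-one continuous map
of $G/K$ onto $\mathcal{H}$, and Problem 3 at the end of the chapter shows that this map
is a homeomorphism. The group $G$ is generated by commutators and hence is
unimodular, and the subgroup $K$ is unimodular, being compact. Theorem 6.18
therefore says that $\mathcal{H}$ has a $G$-invariant Borel measure that is unique up to a scalar
factor. Let us see for $p = -2$ that the measure $y^{p}\,dx\,dy$ is invariant under the
subgroup acting in $(*)$. We have

$$\begin{pmatrix} y_0^{1/2} & x_0y_0^{-1/2} \\ 0 & y_0^{-1/2} \end{pmatrix}(x + iy) = y_0(x + iy) + x_0 = (y_0x + x_0) + iy_0y. \tag{$**$}$$

If $\varphi$ denotes left translation by the matrix on the left in $(**)$, then $(dx\,dy)_{\varphi^{-1}} =
y_0^2\,dx\,dy$. Hence $(y^{-2}\,dx\,dy)_{\varphi^{-1}} = (y^{-2})^{\varphi}\,(dx\,dy)_{\varphi^{-1}} = (y_0^{-2}y^{-2})(y_0^2\,dx\,dy) =
y^{-2}\,dx\,dy$, and $y^{-2}\,dx\,dy$ is preserved by every matrix in $(**)$. The group $G$ is
generated by the matrices in $(*)$ and the one additional matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Since

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}(x + iy) = \frac{1}{-(x + iy)} = \frac{-x + iy}{x^2 + y^2},$$

$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ sends $y^{-2}\,dx\,dy$ to $\Big(\dfrac{y}{x^2 + y^2}\Big)^{-2} |\det J|\,dx\,dy$, where $J$ is the Jacobian
matrix of $F(x,y) = \Big(\dfrac{-x}{x^2 + y^2},\ \dfrac{y}{x^2 + y^2}\Big)$, namely

$$J = \begin{pmatrix} \dfrac{x^2 - y^2}{(x^2 + y^2)^2} & \dfrac{2xy}{(x^2 + y^2)^2} \\[2ex] \dfrac{-2xy}{(x^2 + y^2)^2} & \dfrac{x^2 - y^2}{(x^2 + y^2)^2} \end{pmatrix}.$$

Calculation gives $|\det J| = (x^2 + y^2)^{-2}$, and therefore $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ sends $y^{-2}\,dx\,dy$ to
itself. Consequently $y^{-2}\,dx\,dy$ is, up to a multiplicative constant, the one and
only $G$-invariant measure on $\mathcal{H}$.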
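The invariance of $y^{-2}\,dx\,dy$ can be spot-checked numerically for an arbitrary element of $SL(2,\mathbb{R})$, not only for the generators treated above. The following sketch approximates the real Jacobian determinant of a linear fractional transformation by central differences and compares the transformed density with $y^{-2}$. The function names, the sample matrices, and the sample point are assumptions chosen for illustration.

```python
def mobius(g, z):
    """Action of g = ((a, b), (c, d)) on z by z -> (a z + b) / (c z + d)."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

def jacobian_det(g, z, step=1e-6):
    """Real Jacobian determinant of (x, y) -> mobius(g, x + iy), by central differences."""
    wx = (mobius(g, z + step) - mobius(g, z - step)) / (2 * step)
    wy = (mobius(g, z + 1j * step) - mobius(g, z - 1j * step)) / (2 * step)
    # Columns of the Jacobian are (Re wx, Im wx) and (Re wy, Im wy).
    return wx.real * wy.imag - wx.imag * wy.real

def transformed_density(g, z):
    """Density of E -> m(g E) at z when m = y^{-2} dx dy."""
    w = mobius(g, z)
    return w.imag ** -2 * abs(jacobian_det(g, z))

g_general = ((2.0, 3.0), (1.0, 2.0))    # determinant 2*2 - 3*1 = 1
g_flip = ((0.0, 1.0), (-1.0, 0.0))      # the additional generator from the example
z = 0.3 + 1.2j
for g in (g_general, g_flip):
    print(transformed_density(g, z), z.imag ** -2)   # the two values should agree
```

Replacing the exponent $-2$ by any other value breaks the agreement for `g_flip`, in line with the example's choice $p = -2$.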


xy? 2xy 
. aan 24 y2)2 21 yy2)2 
matrix of F(x, y) = (ore at) namely J = e ee ) Cal- 


culation gives | det J| = (x* + y*)~?, and therefore ( 


5. Convolution and $L^p$ Spaces


We turn our attention to the way that Haar measure arises in real analysis. This 
section will introduce convolution, and aspects of Fourier analysis in the setting 
of various kinds of locally compact groups will be touched upon in later sections 
and in the problems at the end of that chapter. In most such applications of 
Haar measure to Fourier analysis, one assumes that the group under study is 
unimodular, even if some of its closed subgroups are not. 

Thus let G be a locally compact group. We assume throughout this section 
that G is unimodular. We can then write dx for a two-sided Haar measure on G. 
Proposition 6.15c shows that we have $\int_G f(x^{-1})\,dx = \int_G f(x)\,dx$ for all Borel
functions $f \ge 0$. We abbreviate $L^p(G, dx)$ as $L^p(G)$.


Proposition 6.19. Let $G$ be unimodular, let $1 \le p < \infty$, and let $f$ be a Borel
function in $L^p$. Then $g \mapsto gf$ and $g \mapsto fg$ are continuous functions from $G$
into $L^p$.


PROOF. Lemma 6.14 gives the result for $f$ in $C_{\mathrm{com}}(G)$. Proposition 11.21 of
Basic shows that $C_{\mathrm{com}}(G)$ is dense in $L^p(G)$. Given $g_0 \in G$ and $\epsilon > 0$, find $h$ in
$C_{\mathrm{com}}(G)$ with $\|f - h\|_p \le \epsilon$. Then

$$\begin{aligned}
\|gf - g_0f\|_p &\le \|gf - gh\|_p + \|gh - g_0h\|_p + \|g_0h - g_0f\|_p \\
&= 2\|f - h\|_p + \|gh - g_0h\|_p && \text{by left invariance of } dx \\
&\le 2\epsilon + \|gh - g_0h\|_p,
\end{aligned}$$

and hence $\limsup_{g \to g_0} \|gf - g_0f\|_p \le 2\epsilon$. Since $\epsilon$ is arbitrary, we see that $gf$
tends to $g_0f$ in $L^p(G)$ as $g$ tends to $g_0$. Similarly $fg$ tends to $fg_0$ in $L^p(G)$ as
$g$ tends to $g_0$.


A key tool for real analysis on $G$ is convolution, just as it was with $\mathbb{R}^N$. On a
formal level the convolution $f * h$ of two functions $f$ and $h$ is

$$(f * h)(x) = \int_G f(xy^{-1})h(y)\,dy = \int_G f(y)h(y^{-1}x)\,dy.$$

The formal equality of the two integrals comes about by changing $y$ into $y^{-1}$ in the
first integral and then replacing $xy$ by $y$. If $G$ is abelian, then $xy^{-1} = y^{-1}x$; thus
the first integral for $f * h$ equals the second integral for $h * f$, and the conclusion
is that convolution is commutative. However, convolution is not commutative if
$G$ is nonabelian.
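A finite group gives a setting where both integrals are finite sums and can be compared directly. The sketch below uses the symmetric group $S_3$ with counting measure as Haar measure; this choice of group, the representation of permutations as tuples, and the function names are assumptions made for illustration. It checks that the two formulas for $f * h$ agree and that $f * h \ne h * f$ for a pair of point masses at non-commuting elements.

```python
from itertools import permutations

G = list(permutations(range(3)))      # the six elements of S_3

def mul(p, q):
    """Composition: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def conv_first(f, h):
    """(f * h)(x) = sum over y of f(x y^{-1}) h(y)."""
    return {x: sum(f[mul(x, inv(y))] * h[y] for y in G) for x in G}

def conv_second(f, h):
    """(f * h)(x) = sum over y of f(y) h(y^{-1} x)."""
    return {x: sum(f[y] * h[mul(inv(y), x)] for y in G) for x in G}

def delta(p):
    """Point mass at the group element p."""
    return {x: 1 if x == p else 0 for x in G}

s, t = (1, 0, 2), (0, 2, 1)           # two transpositions that do not commute
f, h = delta(s), delta(t)
same_formula = conv_first(f, h) == conv_second(f, h)
commutes = conv_first(f, h) == conv_first(h, f)
print(same_formula, commutes)
```

For point masses the convolution is the point mass at the product, so $\delta_s * \delta_t = \delta_{st}$ while $\delta_t * \delta_s = \delta_{ts}$, and the failure of commutativity in the group is inherited by the convolution.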

To make mathematical sense out of $f * h$, we adapt the corresponding known
discussion⁷ for the special case $G = \mathbb{R}^N$. Let us begin with the case that $f$ and $h$
are nonnegative Borel functions on $G$. The question is whether $f * h$ is meaningful
as a Borel function $\ge 0$. In fact, $(x,y) \mapsto f(xy^{-1})$ is the composition of the
continuous function $F : G \times G \to G$ given by $F(x,y) = xy^{-1}$, followed by the
Borel function $f : G \to [0,+\infty]$. If $U$ is open in $[0,+\infty]$, then $f^{-1}(U)$ is in
$\mathcal{B}(G)$, and an argument like the one for Proposition 6.8 shows that $(f \circ F)^{-1}(U) =
F^{-1}(f^{-1}(U))$ is in $\mathcal{B}(G \times G)$. Then the product $(x,y) \mapsto f(xy^{-1})h(y)$ is a
Borel function, and we would like to use Fubini's Theorem to conclude that
$x \mapsto (f * h)(x)$ is a Borel function $\ge 0$. Unfortunately we do not know whether
the $\sigma$-algebras match properly, specifically whether $\mathcal{B}(G \times G) = \mathcal{B}(G) \times \mathcal{B}(G)$.

On the other hand, this kind of product relation does hold for Baire sets. We
therefore repeat the above argument with nonnegative Baire functions in place of
nonnegative Borel functions. Now the only possible difficulty comes from the
fact that Haar measure on $G$ might not be $\sigma$-finite. This problem is easily handled
by the same kind of localization argument as with the proof of uniqueness for
Theorem 6.8: Suppose that $G$ is not $\sigma$-compact and that $f \ge 0$ is a Baire function
on $G$. If $E$ is any subset of $[0,+\infty]$, then $f^{-1}(E)$ and $f^{-1}(E^c)$ are disjoint
Baire sets. Since any two Baire sets that fail to be $\sigma$-bounded have nonempty
intersection, only one of $f^{-1}(E)$ and $f^{-1}(E^c)$ can fail to be $\sigma$-bounded. It follows
that there is exactly one member $c$ of $[0,+\infty]$ for which $f^{-1}(c)$ is not $\sigma$-bounded.
So as to avoid unimportant technicalities, let us assume for all Baire functions
under discussion that this value is 0, i.e., that each Baire function considered
in some convolution vanishes off some $\sigma$-bounded set. Any $\sigma$-bounded set is
contained in some $\sigma$-compact open subgroup $G_0$ of $G$, and thus the convolution
effectively takes place on the $\sigma$-compact open subgroup $G_0$; the convolution is 0
outside $G_0$.


Proposition 6.20. Suppose that $f$ and $h$ are nonnegative Baire functions on
$G$, each vanishing off a $\sigma$-bounded subset of $G$. Let $1 \le p \le \infty$, and let $p'$
be the dual index. Then convolution is finite almost everywhere in the following
cases, and then the indicated inequalities of norms are satisfied:

(a) for $f$ in $L^1(G)$ and $h$ in $L^p(G)$, and then $\|f * h\|_p \le \|f\|_1\|h\|_p$,
    for $f$ in $L^p(G)$ and $h$ in $L^1(G)$, and then $\|f * h\|_p \le \|f\|_p\|h\|_1$,
(b) for $f$ in $L^p(G)$ and $h$ in $L^{p'}(G)$, and then $\|f * h\|_{\sup} \le \|f\|_p\|h\|_{p'}$,
    for $f$ in $L^{p'}(G)$ and $h$ in $L^p(G)$, and then $\|f * h\|_{\sup} \le \|f\|_{p'}\|h\|_p$.

Consequently $f * h$ is defined in the above situations even if the scalar-valued
functions $f$ and $h$ are not necessarily $\ge 0$, and the estimates on the norm of $f * h$
are still valid. In case (b), the function $f * h$ is actually continuous.

⁷ The discussion in question appears in Section VI.2 of Basic.


REMARK. The proof of the continuity in (b) will show actually that f * h is 
uniformly continuous in a certain sense. 


PROOF. The argument for measurability has been given above. The argument
for the norm inequalities is proved in the same way⁸ as in the special case that
$G = \mathbb{R}^N$. Namely, we use Minkowski's inequality for integrals to handle (a),
and we use Hölder's inequality to handle (b).

Now consider the question of continuity in (b). At least one of the indices
$p$ and $p'$ is finite. First suppose that $p$ is finite. We observe for $g \in G$ that

$$g(f * h)(x) = (f * h)(g^{-1}x) = \int_G f(g^{-1}xy^{-1})h(y)\,dy = \int_G (gf)(xy^{-1})h(y)\,dy = ((gf) * h)(x).$$

Then we use the bound $\|f * h\|_{\sup} \le \|f\|_p\|h\|_{p'}$ to make the
estimate, for $g \in G$, that

$$\|g(f * h) - (f * h)\|_{\sup} = \|(gf) * h - f * h\|_{\sup} = \|(gf - f) * h\|_{\sup} \le \|gf - f\|_p\|h\|_{p'}.$$

Proposition 6.19 shows that the right side tends to 0 as $g$ tends to 1, and hence
$\lim_{g \to 1}(f * h)(g^{-1}x) = (f * h)(x)$. If instead $p'$ is finite, we argue similarly
with right translations of $h$, finding first that $(f * h)g = f * (hg)$ and then that
$\|(f * h)g - (f * h)\|_{\sup} \le \|f\|_p\|hg - h\|_{p'}$. Application of Proposition 6.19
therefore shows that $\lim_{g \to 1}(f * h)(xg^{-1}) = (f * h)(x)$.


Corollary 6.21. Convolution makes $L^1(G)$ into an associative algebra
(possibly without identity) in such a way that the norm satisfies
$\|f * h\|_1 \le \|f\|_1 \|h\|_1$ for all f and h in $L^1(G)$.


PROOF. The norm inequality was proved in Proposition 6.20a, and it justifies 
the interchange of integrals in the calculation 


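\[
\begin{aligned}
((f_1 * f_2) * f_3)(x) &= \int_G \int_G f_1(y) f_2(y^{-1}z) f_3(z^{-1}x)\,dy\,dz \\
&= \int_G \int_G f_1(y) f_2(y^{-1}z) f_3(z^{-1}x)\,dz\,dy \\
&= \int_G \int_G f_1(y) f_2(z) f_3(z^{-1}y^{-1}x)\,dz\,dy \qquad \text{under } z \mapsto yz \\
&= (f_1 * (f_2 * f_3))(x),
\end{aligned}
\]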


which in turn proves associativity. 
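
The algebra structure in Corollary 6.21 can be checked by hand on a finite group, where Haar measure is normalized counting measure and every integral is a finite sum. The following is a minimal sketch, assuming the group is the symmetric group on three letters and using three arbitrary rational-valued test functions (the names G, mul, inv, convolve, norm1 and the data f1, f2, f3 are illustrative choices, not notation from the text). Exact rational arithmetic is used so that associativity can be tested as an equality.

```python
from fractions import Fraction
from itertools import permutations

# The symmetric group on {0, 1, 2}; a permutation p is a tuple with p[i] the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Return the composition p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Return the inverse permutation."""
    r = [0, 0, 0]
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def convolve(f, h):
    """(f*h)(x) = integral of f(y) h(y^{-1}x) dy for normalized counting measure."""
    return {x: sum(f[y] * h[mul(inv(y), x)] for y in G) / len(G) for x in G}

def norm1(f):
    """L^1 norm relative to normalized counting measure."""
    return sum(abs(f[x]) for x in G) / len(G)

# Three arbitrary rational-valued test functions (hypothetical data).
f1 = {x: Fraction(i - 2) for i, x in enumerate(G)}
f2 = {x: Fraction((-1) ** i * (i + 1), 3) for i, x in enumerate(G)}
f3 = {x: Fraction(i * i - 3, 2) for i, x in enumerate(G)}

associative = convolve(convolve(f1, f2), f3) == convolve(f1, convolve(f2, f3))
submultiplicative = norm1(convolve(f1, f2)) <= norm1(f1) * norm1(f2)
print(associative, submultiplicative)
```

Both checks should come out true for any choice of test functions, since the sketch only re-expresses the change of variables $z \mapsto yz$ and the triangle inequality as finite sums.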


⁸ Propositions 6.14 and 9.10 of Basic.

We shall need the following result in proving the Peter-Weyl Theorem in
Section 7.


Proposition 6.22. Let G be a compact group, let f be in $L^1(G)$, and let h
be in $L^2(G)$. Put $F(x) = \int_G f(y)h(y^{-1}x)\,dy$. Then F is the limit in $L^2(G)$
of a sequence of functions, each of which is a finite linear combination of left
translates of h.


REMARK. For a comparable result in R, see Corollary 6.17 of Basic. We 
know from Proposition 6.15b that compact groups are unimodular. 


For the proof we require a lemma.

Lemma 6.23. Let G be a compact group, and let h be in $L^2(G)$. For any
$\epsilon > 0$, there exist finitely many $y_i \in G$ and Borel sets $E_i \subseteq G$ such that the $E_i$
disjointly cover G and

\[
\|h(y^{-1}x) - h(y_i^{-1}x)\|_{2,x} < \epsilon \qquad \text{for all } i \text{ and for all } y \in E_i.
\]


PROOF. By Proposition 6.19 choose an open neighborhood U of 1 such
that $\|h(gx) - h(x)\|_{2,x} < \epsilon$ whenever g is in U. For each $z_0 \in G$, we have
$\|h(gz_0x) - h(z_0x)\|_{2,x} < \epsilon$ whenever g is in U. The set $Uz_0$ is an open
neighborhood of $z_0$, and such sets cover G as $z_0$ varies. Find a finite subcover,
say $Uz_1, \dots, Uz_n$, and let $U_i = Uz_i$. Define $F_j = U_j - \bigcup_{i=1}^{j-1} U_i$ for $1 \le j \le n$.
Then the lemma follows with $y_i = z_i^{-1}$ and $E_i = F_i^{-1}$.


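PROOF OF PROPOSITION 6.22. Given $\epsilon > 0$, choose $y_i$ and $E_i$ as in Lemma
6.23, and put $c_i = \int_{E_i} f(y)\,dy$. Then

\[
\begin{aligned}
\Bigl\| \int_G f(y)h(y^{-1}x)\,dy - \sum_i c_i h(y_i^{-1}x) \Bigr\|_{2,x}
&\le \Bigl\| \sum_i \int_{E_i} |f(y)|\,|h(y^{-1}x) - h(y_i^{-1}x)|\,dy \Bigr\|_{2,x} \\
&\le \sum_i \int_{E_i} |f(y)|\,\|h(y^{-1}x) - h(y_i^{-1}x)\|_{2,x}\,dy \\
&\le \sum_i \int_{E_i} |f(y)|\,\epsilon\,dy = \epsilon \|f\|_1.
\end{aligned}
\]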


6. Representations of Compact Groups 


The subject of functional analysis always suggests trying to replace a mathe- 
matical problem about functions by a problem about a space of functions and 
working at solving the latter. By way of example, this point of view is what lay 
behind our approach in Section I.2 to certain kinds of boundary-value problems 
by using the method of separation of variables. In some of the cases of separation
of variables we considered, as well as in other situations arising in nature, the 
problem has some symmetry to it, and that symmetry gets passed along to the 
space of functions under study. Mathematically the symmetry is captured by a 
group, since the set of symmetries is associative and is closed under composition 
and inversion. The subject of representation theory deals with exploiting such 
symmetry, at least in cases for which the problem about functions is linear. 

We shall begin with a definition and some examples of finite-dimensional rep- 
resentations of an arbitrary topological group, and then we shall develop a certain 
amount of theory of finite-dimensional representations under the assumption that 
the group is compact. The main theorem in this situation is the Peter-Weyl
Theorem, which we take up in the next section. In Section 8 we introduce 
infinite-dimensional representations because vector spaces of functions that arise 
in analysis problems are frequently infinite-dimensional; in that section we study 
what happens when the group is compact, but a considerable body of mathematics 
beyond the scope of this book investigates what can happen for a noncompact 
group. 

Historically the original representations that were studied were matrix
representations. An N-by-N matrix representation of a topological group G
is a continuous homomorphism $\Phi$ of G into the group $GL(N, \mathbb{C})$ of
invertible complex matrices. In other words, $\Phi(g)$ is an N-by-N invertible
complex matrix for each g in G, the matrices are related by the condition that
$\Phi(gh)_{ij} = \sum_{k=1}^N \Phi(g)_{ik}\Phi(h)_{kj}$, and the functions $g \mapsto \Phi(g)_{ij}$ are continuous.

Eventually it was realized that sticking to matrices obscures what is really
happening. For one thing the group $GL(N, \mathbb{C})$ is being applied to the space $\mathbb{C}^N$
of column vectors, and some vector subspaces of $\mathbb{C}^N$ seem more important than
others when they are really not. Instead, it is better to replace $\mathbb{C}^N$ by a
finite-dimensional complex vector space V and consider continuous homomorphisms
of G into the group $GL_{\mathbb{C}}(V)$ of invertible linear transformations on V. Specifying
an ordered basis of V allows one to identify $GL_{\mathbb{C}}(V)$ with $GL(N, \mathbb{C})$, and then
the homomorphism gets identified with a matrix representation. In the special
case that $V = \mathbb{C}^N$, this identification can be taken to be the usual identification of
linear functions and matrices. The point, however, is that it is unwise to emphasize
one particular ordered basis in advance, and it is better to work with a general
finite-dimensional complex vector space.

Thus we define a finite-dimensional representation of a topological group
G on a finite-dimensional complex vector space V to be a continuous
homomorphism $\Phi$ of G into $GL_{\mathbb{C}}(V)$. The continuity condition means that in any basis of
V the matrix entries of $\Phi(g)$ are continuous for $g \in G$. It is equivalent to say
that $g \mapsto \Phi(g)v$ is a continuous function from G into V for each v in V, i.e.,
that for each v in V, if $\Phi(g)v$ is expanded in terms of a basis of V, then each
entry is a continuous function of g. The vector space V is allowed to be $\mathbb{C}^N$ in
the definition, and thus matrix representations are part of the theory.

Before coming to a list of examples, let us dispose of two easy kinds of examples
that immediately suggest themselves. 

For any G the trivial representation of G on V is the representation $\Phi$ of G for
which $\Phi(g) = 1$ for all $g \in G$. Sometimes when the term "trivial representation"
is used, it is understood that V = C; sometimes the case V = C is indicated by
referring to the "trivial 1-dimensional representation."

If G is a group of real or complex invertible N-by-N matrices, then G is 
a subgroup of GL(N, C), and the relative topology from GL(N, C) makes G 
into a topological group. The inclusion mapping $\Phi$ of G into GL(N, C) is
a representation known as the standard representation of G. The following 
question then arises: If G is such a group, why consider representations of G 
when we already have one? The answer, from an analyst’s point of view, is that 
representations are thrust on us by some mathematical problem that we want to 
solve, and we have to work with what we are given; other representations than 
the standard one may occur in the process. 


EXAMPLES OF FINITE-DIMENSIONAL REPRESENTATIONS. 


(1) One-dimensional representations. A continuous homomorphism of a
topological group G into the multiplicative group $\mathbb{C}^\times$ of nonzero complex numbers is
a representation because we can regard $\mathbb{C}^\times$ as $GL(1, \mathbb{C})$. Of special interest are
the representations of this kind that take values in the unit circle $\{e^{i\theta}\}$. These are
called multiplicative characters.

(a) The exponential functions that arise in Fourier series are examples; the
group G in this case is the circle group $S^1$, namely the quotient of $\mathbb{R}$ modulo the
subgroup $2\pi\mathbb{Z}$ of multiples of $2\pi$, and for each integer n, the function $x \mapsto e^{inx}$
is a multiplicative character of $\mathbb{R}$ that descends to a well-defined multiplicative
character of $S^1$.

(b) The exponential functions that arise in the definition of the Fourier
transform on $\mathbb{R}^N$, namely $x \mapsto e^{ix \cdot y}$, are multiplicative characters of the additive
group $\mathbb{R}^N$.

(c) Let $J_m$ be the cyclic group $\{0, 1, 2, \dots, m-1\}$ of integers modulo m
under addition, and let $\zeta_m = e^{2\pi i/m}$. For each integer n and for k in $J_m$, the formula
$\chi_n(k) = (\zeta_m^n)^k$ defines a multiplicative character $\chi_n$ of $J_m$. These multiplicative
characters are distinct for $0 \le n \le m-1$.

(d) If G is the symmetric group $\mathfrak{S}_n$ on n letters, then the sign mapping
$\sigma \mapsto \operatorname{sgn} \sigma$ is a multiplicative character.

(e) The integer powers of the determinant are multiplicative characters of 
the unitary group U(N). 


(2) Some representations of the symmetric group $\mathfrak{S}_3$ on three letters.

(a) The trivial character and the sign character defined in Example 1d above
are the only multiplicative characters.

(b) For each permutation $\sigma$, let $\Phi(\sigma)$ be the 3-by-3 matrix of the
linear transformation carrying the standard ordered basis $(e_1, e_2, e_3)$ of $\mathbb{C}^3$ to
the ordered basis $(e_{\sigma(1)}, e_{\sigma(2)}, e_{\sigma(3)})$. To check that $\Phi$ is indeed a
representation, we start from $\Phi(\sigma)e_j = e_{\sigma(j)}$; applying $\Phi(\tau)$ to both sides, we obtain
$\Phi(\tau)\Phi(\sigma)e_j = \Phi(\tau)e_{\sigma(j)} = e_{\tau(\sigma(j))} = e_{(\tau\sigma)(j)} = \Phi(\tau\sigma)e_j$, and we conclude
that $\Phi(\tau)\Phi(\sigma) = \Phi(\tau\sigma)$. The vector $e_1 + e_2 + e_3$ is fixed by each $\Phi(\sigma)$, and
therefore the 1-dimensional vector subspace $\mathbb{C}(e_1 + e_2 + e_3)$ is "invariant" in the
sense of being carried to itself under $\Phi(\mathfrak{S}_3)$.

(c) Place an equilateral triangle in the plane $\mathbb{R}^2$ with its center at the origin
and with vertices given in polar coordinates by $(r, \theta) = (1, 0)$, $(1, 2\pi/3)$, and
$(1, 4\pi/3)$. Let the vertices be numbered 1, 2, 3, and let $\Phi(\sigma)$ be the matrix of
the linear transformation carrying vertex j to vertex $\sigma(j)$ for each j. Then $\Phi$ is
given on the transpositions (1 2) and (2 3) by

\[
\Phi((1\ 2)) = \begin{pmatrix} -1/2 & \sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{pmatrix}
\qquad \text{and} \qquad
\Phi((2\ 3)) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]

and is given on any product of these two transpositions by the corresponding
product of the above two matrices. The eigenspaces for $\Phi((2\ 3))$ are $\mathbb{C}e_1$ and
$\mathbb{C}e_2$, and these subspaces are not eigenspaces for $\Phi((1\ 2))$. Consequently the
only vector subspaces carried to themselves by $\Phi(\mathfrak{S}_3)$ are the trivial ones, namely
0 and $\mathbb{C}^2$. The functions on $\mathfrak{S}_3$ of the form $\sigma \mapsto \Phi(\sigma)_{ij}$ will play a role similar
to the role of the functions $x \mapsto e^{inx}$ in Fourier series, and we record their values
here:

\[
\begin{array}{c|cccc}
\sigma & \Phi(\sigma)_{11} & \Phi(\sigma)_{12} & \Phi(\sigma)_{21} & \Phi(\sigma)_{22} \\ \hline
(1) & 1 & 0 & 0 & 1 \\
(1\ 2\ 3) & -1/2 & -\sqrt{3}/2 & \sqrt{3}/2 & -1/2 \\
(1\ 3\ 2) & -1/2 & \sqrt{3}/2 & -\sqrt{3}/2 & -1/2 \\
(1\ 2) & -1/2 & \sqrt{3}/2 & \sqrt{3}/2 & 1/2 \\
(2\ 3) & 1 & 0 & 0 & -1 \\
(1\ 3) & -1/2 & -\sqrt{3}/2 & -\sqrt{3}/2 & 1/2
\end{array}
\]

(3) A family of representations of the unitary group G = U(N). Let V
consist of all polynomials in $z_1, \dots, z_N, \bar z_1, \dots, \bar z_N$ homogeneous of degree k,
i.e., having every monomial of total degree k, and let

\[
\Phi(g)P\left( \begin{pmatrix} z_1 \\ \vdots \\ z_N \end{pmatrix}, \begin{pmatrix} \bar z_1 \\ \vdots \\ \bar z_N \end{pmatrix} \right)
= P\left( g^{-1}\begin{pmatrix} z_1 \\ \vdots \\ z_N \end{pmatrix}, \bar g^{-1}\begin{pmatrix} \bar z_1 \\ \vdots \\ \bar z_N \end{pmatrix} \right).
\]

The vector subspace $V'$ of holomorphic polynomials (those with no $\bar z$'s) is carried
to itself by all $\Phi(g)$, and therefore $V'$ is an invariant subspace in the sense of
being carried to itself by $\Phi(G)$. The restriction of the $\Phi(g)$'s to $V'$ is thus itself
a representation. When k = 1, this representation on $V'$ may at first seem to be
the standard representation of U(N), but it is not. In fact, $V'$ for k = 1 consists
of all linear combinations of the N linear functionals

\[
\begin{pmatrix} z_1 \\ \vdots \\ z_N \end{pmatrix} \mapsto z_1
\qquad \text{through} \qquad
\begin{pmatrix} z_1 \\ \vdots \\ z_N \end{pmatrix} \mapsto z_N.
\]

In other words, $V'$ is actually the space of all linear functionals on $\mathbb{C}^N$. The
definition of $\Phi$ by $\Phi(g)\ell(z) = \ell(g^{-1}z)$ for $z \in \mathbb{C}^N$ and for $\ell$ in the space of
linear functionals involves no choice of basis. The representation on $V'$ when
k = 1 is the "contragredient" of the standard representation, in a sense that will
be defined for any representation in Example 6 below.


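(4) A family of representations of the special unitary group G = SU(2) of
all 2-by-2 unitary matrices of determinant 1, namely all matrices
$\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$ with
$|\alpha|^2 + |\beta|^2 = 1$. Let V be the space of homogeneous holomorphic polynomials
of degree n in $z_1$ and $z_2$, let $\Phi$ be the representation defined in the same way as in
Example 3, and let $V'$ be the space of all holomorphic polynomials in z of degree
$\le n$ with

\[
\Phi'\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} Q(z)
= (\bar\beta z + \alpha)^n\, Q\left( \frac{\bar\alpha z - \beta}{\bar\beta z + \alpha} \right).
\]

Define $E : V \to V'$ by $(EP)(z) = P\begin{pmatrix} z \\ 1 \end{pmatrix}$. Then E is an invertible linear
mapping and satisfies $E\Phi(g) = \Phi'(g)E$ for all g, and we say that E exhibits $\Phi$
and $\Phi'$ as equivalent (i.e., isomorphic).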


(5) A family of representations for G equal to the orthogonal group O(N) or
the rotation subgroup SO(N). Let V consist of all polynomials in $x_1, \dots, x_N$
homogeneous of degree k, and let

\[
\Phi(g)P\begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix} = P\left( g^{-1}\begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix} \right).
\]

Then $\Phi$ is a representation. When we want to emphasize the degree, let us write
$\Phi_k$ and $V_k$. Define the Laplacian operator as usual by

\[
\Delta = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_N^2}.
\]

This carries $V_k$ to $V_{k-2}$, and one checks easily that it satisfies $\Delta\Phi_k(g) =
\Phi_{k-2}(g)\Delta$. This commutativity property implies that the kernel of $\Delta$ is an
invariant subspace of $V_k$, the space of homogeneous harmonic polynomials
of degree k.


(6) Contragredient representation. Let G be any topological group, and let
$\Phi$ be a finite-dimensional representation of G on the complex vector space V.
The contragredient of $\Phi$ is the representation $\Phi^c$ of G on the space of all linear
functionals on V defined by $(\Phi^c(g)\ell)(v) = \ell(\Phi(g^{-1})v)$ for any linear functional
$\ell$ and any v in V.
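
The assertion in Example 2c that the two transposition matrices determine a representation of the symmetric group on three letters can be checked numerically. The sketch below is only an illustration under stated assumptions: permutations are stored as tuples (p(1), p(2), p(3)), products are read right to left, and the helper names matmul, compose, close, gens, and Phi are hypothetical. It builds the six matrices from the two generators, tests the homomorphism property on all 36 pairs, and compares the result with the table of values recorded in Example 2c.

```python
import math

R3 = math.sqrt(3.0)
I2 = ((1.0, 0.0), (0.0, 1.0))
S = ((-0.5, R3 / 2), (R3 / 2, 0.5))   # Phi((1 2)) from Example 2c
T = ((1.0, 0.0), (0.0, -1.0))         # Phi((2 3)) from Example 2c

def matmul(a, b):
    """Product of two 2-by-2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def compose(p, q):
    """p∘q on {1, 2, 3}; a permutation is the tuple (p(1), p(2), p(3))."""
    return tuple(p[q[i] - 1] for i in range(3))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Generate the whole group from the two transpositions, carrying matrices along.
gens = {(2, 1, 3): S, (1, 3, 2): T}
Phi = {(1, 2, 3): I2}
frontier = [(1, 2, 3)]
while frontier:
    p = frontier.pop()
    for g, mat in gens.items():
        gp = compose(g, p)
        if gp not in Phi:
            Phi[gp] = matmul(mat, Phi[p])
            frontier.append(gp)

is_homomorphism = all(close(Phi[compose(p, q)], matmul(Phi[p], Phi[q]))
                      for p in Phi for q in Phi)

# Table of Example 2c: entries (11, 12, 21, 22) for each permutation.
table = {
    (1, 2, 3): (1.0, 0.0, 0.0, 1.0),                 # (1)
    (2, 3, 1): (-0.5, -R3 / 2, R3 / 2, -0.5),        # (1 2 3)
    (3, 1, 2): (-0.5, R3 / 2, -R3 / 2, -0.5),        # (1 3 2)
    (2, 1, 3): (-0.5, R3 / 2, R3 / 2, 0.5),          # (1 2)
    (1, 3, 2): (1.0, 0.0, 0.0, -1.0),                # (2 3)
    (3, 2, 1): (-0.5, -R3 / 2, -R3 / 2, 0.5),        # (1 3)
}
matches_table = all(close(Phi[p], ((a, b), (c, d))) for p, (a, b, c, d) in table.items())
print(len(Phi), is_homomorphism, matches_table)
```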


Having given a number of examples, let us return to a general topological
group G. An important equivalent definition of finite-dimensional representation
is that $\Phi$ is a continuous group action of G on a finite-dimensional complex vector
space V by linear transformations. In this case the assertion about continuity is
that the map $G \times V \to V$ is continuous jointly, rather than continuous only as a
function of the first variable.

Let us deduce the joint continuity from continuity in the first variable. To do
so, it is enough to verify continuity of $G \times V \to V$ at g = 1 and v = 0. Let
$\dim_{\mathbb{C}} V = N$. The topology on V is obtained, as was spelled out above, by
choosing an ordered basis and identifying V with $\mathbb{C}^N$. The resulting topology
makes V into a topological vector space, and the topology does not depend on the
choice of ordered basis; the independence of basis follows from the fact that every
linear mapping on $\mathbb{C}^N$ is continuous. Thus we fix an ordered basis $(v_1, \dots, v_N)$
and regard the map $\{c_i\}_{i=1}^N \mapsto \sum_{i=1}^N c_i v_i$ as a homeomorphism of $\mathbb{C}^N$ onto V.
Put $\bigl\| \sum_{i=1}^N c_i v_i \bigr\| = \bigl( \sum_{i=1}^N |c_i|^2 \bigr)^{1/2}$. Given $\epsilon > 0$, choose for each i between 1
and N a neighborhood $U_i$ of 1 in G such that $\|\Phi(g)v_i - v_i\| < 1$ for $g \in U_i$. If
g is in $\bigcap_{i=1}^N U_i$ and if $v = \sum_i c_i v_i$ has $\|v\| < \epsilon$, then

\[
\begin{aligned}
\|\Phi(g)v\| &\le \bigl\| \Phi(g)\bigl(\textstyle\sum_i c_i v_i\bigr) - \bigl(\textstyle\sum_i c_i v_i\bigr) \bigr\| + \|v\| \\
&\le \sum_i |c_i|\,\|\Phi(g)v_i - v_i\| + \|v\| \\
&\le \bigl( \textstyle\sum_i |c_i|^2 \bigr)^{1/2} N^{1/2} + \|v\| \qquad \text{by the Schwarz inequality} \\
&\le (N^{1/2} + 1)\epsilon.
\end{aligned}
\]

This proves the joint continuity at (g, v) = (1, 0), and the joint continuity
everywhere follows by translation in the two variables separately.

A representation on a nonzero finite-dimensional complex vector space V 
is irreducible if it has no invariant subspaces other than 0 and V. Every 
1-dimensional representation is irreducible, and we observed that Example 2c 
is irreducible. We observed also that Examples 2b and 3 are not irreducible. 

A representation $\Phi$ on the finite-dimensional complex vector space V is called
unitary if an inner product, always assumed Hermitian, has been specified for V
and if each $\Phi(g)$ is unitary relative to that inner product (i.e., has $\Phi(g)^*\Phi(g) = 1$
and hence $\Phi(g)^* = \Phi(g)^{-1}$ for all $g \in G$). On the level of the inner product for
V, a unitary representation has the property that $(\Phi(g)u, v) = (u, \Phi(g)^*v) =
(u, \Phi(g)^{-1}v) = (u, \Phi(g^{-1})v)$.
The question of whether a representation is unitary is important for analysis
because it gets at the notion of exploiting symmetries by using representation
theory. Specifically for a unitary representation the orthogonal complement $U^\perp$
of an invariant vector subspace U is an invariant subspace because

\[
(\Phi(g)u^\perp, u) = (u^\perp, \Phi(g^{-1})u) \in (u^\perp, U) = 0 \qquad \text{for } u^\perp \in U^\perp,\ u \in U.
\]

Thus when an analysis problem leads us to a unitary representation and we locate
an invariant vector subspace, the orthogonal complement will be an invariant
vector subspace also. In this way the analysis problem may have been subdivided
into two simpler problems.

Now let us suppose that the topological group G is compact. One of the critical
properties of such a group for representation theory is that G has, up to a scalar
multiple, a unique two-sided Haar measure, i.e., a nonzero regular Borel measure
that is invariant under all left and right translations. This result was proved in
Theorem 6.8 and Proposition 6.15b. Let us normalize this Haar measure so
that it has total measure 1. Since the normalized measure is unambiguous, we
usually write integrals with respect to normalized Haar measure by expressions
like $\int_G f(x)\,dx$, dropping any name like $\mu$ from the notation. Also, we write
$L^1(G)$ and $L^2(G)$ in place of $L^1(G, dx)$ and $L^2(G, dx)$.

We shall want to use convolution of functions on G, and we therefore need
to confront the technical problem that the measurability in Fubini's Theorem can
break down with Borel measurable functions if G is not separable. For this reason
we shall stick to Baire measurable functions, where no such difficulty occurs.⁹
In particular the spaces $L^1(G)$ and $L^2(G)$ will be understood to have the Baire
sets as the relevant $\sigma$-algebras.¹⁰

The prototypes for the theory with G compact are the cases that G is the circle
group $S^1$ and that G is a finite group, such as the symmetric group $\mathfrak{S}_3$. The Haar
measure is $\frac{1}{2\pi}\,dx$ in the first case, where this time we retain the convention that
dx is Lebesgue measure. The Haar measure is $\frac{1}{|G|}$ times the counting measure in
the second case, the $\frac{1}{|G|}$ having the effect of making the total measure be 1.


Proposition 6.24. If $\Phi$ is a representation of a compact group G on a
finite-dimensional complex vector space V, then V admits an inner product such that
$\Phi$ is unitary.


⁹ Corollary 11.16 of Basic shows that every continuous function of compact support on a locally
compact Hausdorff space is Baire measurable.

¹⁰ Problem 3 at the end of Chapter XI of Basic shows for any regular Borel measure on a compact
Hausdorff space that every Borel measurable function can be adjusted on a Borel set of measure 0 to
be Baire measurable. Consequently the spaces $L^1(G)$ and $L^2(G)$ as Banach spaces are unaffected
by specifying Baire measurability rather than Borel measurability if the Borel measure is regular.
PROOF. Let $\langle \cdot, \cdot \rangle$ be any Hermitian inner product on V, and define

\[
(u, v) = \int_G \langle \Phi(x)u, \Phi(x)v \rangle\,dx.
\]

It is straightforward to see that $(\cdot, \cdot)$ has the required properties.
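
The averaging construction in this proof can be watched in action on a finite group, where the integral is an average over group elements. The following is a minimal sketch under stated assumptions: the group is the symmetric group on three letters, realized by the real 2-by-2 matrices of Example 2c; the representation Psi is obtained by conjugating with a hypothetical non-orthogonal matrix B so that it is not unitary for the standard inner product; and because all entries are real, adjoints reduce to transposes. The names B, Psi, M, and the helper functions are illustrative only.

```python
import math

R3 = math.sqrt(3.0)

def matmul(a, b):
    """Product of two 2-by-2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def transpose(a):
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

I2 = ((1.0, 0.0), (0.0, 1.0))
S = ((-0.5, R3 / 2), (R3 / 2, 0.5))    # Phi((1 2)) from Example 2c
T = ((1.0, 0.0), (0.0, -1.0))          # Phi((2 3)) from Example 2c
group = [I2, S, T, matmul(S, T), matmul(T, S), matmul(S, matmul(T, S))]

# Hypothetical change of basis, chosen only so that Psi = B Phi B^{-1} is not unitary.
B = ((1.0, 1.0), (0.0, 1.0))
B_inv = ((1.0, -1.0), (0.0, 1.0))
Psi = [matmul(B, matmul(x, B_inv)) for x in group]

# Gram matrix of the averaged inner product: (u, v) = v^T M u for real vectors,
# where M is the average of Psi(x)^T Psi(x) over the six group elements.
terms = [matmul(transpose(p), p) for p in Psi]
M = tuple(tuple(sum(t[i][j] for t in terms) / len(terms) for j in range(2))
          for i in range(2))

unitary_before = all(close(matmul(transpose(p), p), I2) for p in Psi)
invariant = all(close(matmul(transpose(p), matmul(M, p)), M) for p in Psi)
positive_definite = M[0][0] > 0 and M[0][0] * M[1][1] - M[0][1] * M[1][0] > 0
print(unitary_before, invariant, positive_definite)
```

The invariance test is the matrix form of the identity $(\Psi(g)u, \Psi(g)v) = (u, v)$, which in the proof comes from translation invariance of Haar measure.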


Corollary 6.25. If $\Phi$ is a representation of a compact group G on a
finite-dimensional complex vector space V, then $\Phi$ is the direct sum of irreducible
representations. In other words, $V = V_1 \oplus \cdots \oplus V_k$, with each $V_j$ an invariant
vector subspace on which $\Phi$ acts irreducibly.


REMARK. The "direct-sum" notation $V = V_1 \oplus \cdots \oplus V_k$ means that each
element of V has a unique expansion as a linear combination of k vectors, one
from each $V_j$. If G is the noncompact group of all complex matrices
$\begin{pmatrix} 1 & z \\ 0 & 1 \end{pmatrix}$, then
the standard representation of G on $\mathbb{C}^2$ has $\mathbb{C}e_1$ as an invariant subspace, but there
is no other invariant subspace $V'$ such that $\mathbb{C}^2 = \mathbb{C}e_1 \oplus V'$. Thus the corollary
breaks down if the hypothesis of compactness is dropped completely.


PROOF. Form $(\cdot, \cdot)$ as in Proposition 6.24. Find an invariant subspace $U \ne 0$
of minimal dimension and take its orthogonal complement $U^\perp$. Since the
representation is unitary relative to $(\cdot, \cdot)$, $U^\perp$ is an invariant subspace. Repeating
the argument with $U^\perp$ and iterating, we obtain the required decomposition.


Proposition 6.26 (Schur's Lemma, part 1). Suppose that $\Phi$ and $\Phi'$ are
irreducible representations of a compact group G on finite-dimensional complex
vector spaces V and $V'$, respectively. If $L : V \to V'$ is a linear map such that
$\Phi'(g)L = L\Phi(g)$ for all $g \in G$, then L is one-one onto or L = 0.


PROOF. We see easily that ker L and image L are invariant subspaces of V and 
V', respectively, and then the only possibilities are the ones listed. 


Corollary 6.27 (Schur's Lemma, part 2). Suppose $\Phi$ is an irreducible
representation of a compact group G on a finite-dimensional complex vector space V.
If $L : V \to V$ is a linear map such that $\Phi(g)L = L\Phi(g)$ for all $g \in G$, then L
is scalar.


REMARK. This is the first place where we make use of the fact that the scalars 
are complex, not real. 


PROOF. Let $\lambda$ be an eigenvalue of L. Then $L - \lambda I$ is not one-one onto, but it
does commute with $\Phi(g)$ for all $g \in G$. By Proposition 6.26, $L - \lambda I = 0$.
Corollary 6.28. Every irreducible finite-dimensional representation of a
compact abelian group G is given, up to equivalence, by a multiplicative character.


PROOF. If G is abelian and $\Phi$ is irreducible, we apply Corollary 6.27 with
$L = \Phi(g_0)$ and see that $\Phi(g_0)$ is scalar. All the members of $\Phi(G)$ are therefore
scalar, and every vector subspace is invariant. For irreducibility the representation 
must then be 1-dimensional. Fixing a basis {v} of the 1-dimensional vector 
space and forming the corresponding 1-by-1 matrices, we obtain a multiplicative 
character. 
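
For a concrete compact abelian group, one can list multiplicative characters explicitly and test their defining properties. The sketch below is an illustration only, assuming the finite cyclic group $J_m$ of Example 1c with the hypothetical choice m = 6 and normalized counting measure as Haar measure; the function names chi and inner are not from the text. It checks the homomorphism property, the unit modulus, and the orthonormality of the m characters $\chi_n$ in $L^2(J_m)$; orthonormality in particular confirms that the $\chi_n$ are distinct for $0 \le n \le m-1$.

```python
import cmath

m = 6                                  # hypothetical choice of modulus
zeta = cmath.exp(2j * cmath.pi / m)    # zeta_m = e^{2 pi i / m}

def chi(n, k):
    """The character chi_n(k) = (zeta_m^n)^k of J_m."""
    return (zeta ** n) ** k

def inner(n1, n2):
    """Inner product of chi_n1 and chi_n2 in L^2(J_m), normalized counting measure."""
    return sum(chi(n1, k) * chi(n2, k).conjugate() for k in range(m)) / m

multiplicative = all(
    abs(chi(n, (j + k) % m) - chi(n, j) * chi(n, k)) < 1e-9
    for n in range(m) for j in range(m) for k in range(m)
)
unit_modulus = all(abs(abs(chi(n, k)) - 1.0) < 1e-9
                   for n in range(m) for k in range(m))
orthonormal = all(
    abs(inner(n1, n2) - (1.0 if n1 == n2 else 0.0)) < 1e-9
    for n1 in range(m) for n2 in range(m)
)
print(multiplicative, unit_modulus, orthonormal)
```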


EXAMPLE 1a, continued. For the circle group $S^1 = \mathbb{R}/2\pi\mathbb{Z}$, we observed that
we obtain a family of multiplicative characters parametrized by the integers, the
$n$-th such character being

\[
x \mapsto e^{inx}.
\]

The corresponding 1-dimensional representation is $x \mapsto$ multiplication by $e^{inx}$.
In the next corollary we shall prove that the multiplicative characters are
orthogonal in $L^2(S^1)$ in the same sense that the exponential functions are orthogonal. The
known completeness of the orthonormal system of exponential functions therefore
gives a proof, though not the simplest proof, that the exponential functions are
the only multiplicative characters of $S^1$. A simpler proof can be constructed via
real-variable theory by making direct use of the multiplicative property and the
continuity.


EXAMPLES 2a and 2c, continued. We noted that the trivial character and the sign
character are the only multiplicative characters of $\mathfrak{S}_3$. These are the following
two functions of $\sigma \in \mathfrak{S}_3$:

\[
\begin{array}{c|rr}
\sigma & \Phi = 1 & \Phi = \operatorname{sign} \\ \hline
(1) & 1 & 1 \\
(1\ 2\ 3) & 1 & 1 \\
(1\ 3\ 2) & 1 & 1 \\
(1\ 2) & 1 & -1 \\
(2\ 3) & 1 & -1 \\
(1\ 3) & 1 & -1
\end{array}
\]

For this example the corollary below will say that these two functions on $\mathfrak{S}_3$,
together with the four functions listed earlier for Example 2c, form an orthogonal
set of six functions. They are not quite orthonormal since the four functions f
listed earlier have $\|f\|_2 = 1/\sqrt{2}$ relative to the normalized counting measure. The
interpretation of $1/\sqrt{2}$ is that its square is the reciprocal of the dimension of the
underlying vector space.
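
The orthogonality claim for these six functions is a finite computation and can be checked directly from the two tables. The following is a minimal sketch; the dictionary keys and the helper name inner are illustrative, the rows are ordered (1), (1 2 3), (1 3 2), (1 2), (2 3), (1 3), and the inner product is the one for $L^2$ with respect to normalized counting measure. Since all values are real, no complex conjugation is needed.

```python
import math

r = math.sqrt(3.0) / 2
# Rows: (1), (1 2 3), (1 3 2), (1 2), (2 3), (1 3), as in the two tables of the text.
functions = {
    "trivial": [1, 1, 1, 1, 1, 1],
    "sign":    [1, 1, 1, -1, -1, -1],
    "Phi_11":  [1, -0.5, -0.5, -0.5, 1, -0.5],
    "Phi_12":  [0, -r, r, r, 0, -r],
    "Phi_21":  [0, r, -r, r, 0, -r],
    "Phi_22":  [1, -0.5, -0.5, 0.5, -1, 0.5],
}

def inner(f, g):
    """Inner product in L^2 for normalized counting measure (all values are real)."""
    return sum(a * b for a, b in zip(f, g)) / 6

names = list(functions)
gram = {(a, b): inner(functions[a], functions[b]) for a in names for b in names}
orthogonal = all(abs(gram[a, b]) < 1e-12 for a in names for b in names if a != b)
norms_squared = {a: gram[a, a] for a in names}
print(orthogonal, norms_squared)
```

The squared norms should be 1 for the two multiplicative characters and 1/2 for each of the four matrix entries, in line with the value $1/\sqrt{2}$ for the norm.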
Corollary 6.29 (Schur orthogonality relations).

(a) Let $\Phi$ and $\Phi'$ be inequivalent irreducible unitary representations of a
compact group G on finite-dimensional complex vector spaces V and $V'$, respectively,
and let the understood inner products be denoted by $(\cdot, \cdot)$. Then

\[
\int_G (\Phi(x)u, v)\,\overline{(\Phi'(x)u', v')}\,dx = 0 \qquad \text{for all } u, v \in V \text{ and } u', v' \in V'.
\]

(b) Let $\Phi$ be an irreducible unitary representation on a finite-dimensional
complex vector space V, and let the understood inner product be denoted by
$(\cdot, \cdot)$. Then

\[
\int_G (\Phi(x)u_1, v_1)\,\overline{(\Phi(x)u_2, v_2)}\,dx = \frac{(u_1, u_2)\,\overline{(v_1, v_2)}}{\dim V} \qquad \text{for } u_1, v_1, u_2, v_2 \in V.
\]


REMARK. The proof of (b) will make use of the notion of the "trace" of a square
matrix or of a linear map from a finite-dimensional vector space V to itself. For
an n-by-n square matrix A the trace is the sum of the diagonal entries. This is
$(-1)^{n-1}$ times the coefficient of $\lambda^{n-1}$ in the polynomial $\det(A - \lambda 1)$. Because
of the multiplicative property of the determinant, this polynomial is the same for
A as for $BAB^{-1}$ if B is invertible. Hence A and $BAB^{-1}$ have the same trace.
Then it follows that the trace Tr L of a linear map L from V to itself is well
defined as the trace of the matrix of the linear map relative to any basis. For
further background about the trace, see Section II.5.


PROOF. (a) Let $l : V' \to V$ be any linear map, and form the linear map

\[
L = \int_G \Phi(x)\,l\,\Phi'(x^{-1})\,dx.
\]

(This integration can be regarded as occurring for matrix-valued functions and
is to be handled entry-by-entry.) Because of the left invariance of dx, we obtain
$\Phi(y)L\Phi'(y^{-1}) = L$, so that $\Phi(y)L = L\Phi'(y)$ for all $y \in G$. By Proposition
6.26 and the assumed inequivalence, L = 0. Thus $(Lv', v) = 0$. For the particular
choice of l as $l(w') = (w', u')u$, we have

\[
\begin{aligned}
0 = (Lv', v) &= \int_G (\Phi(x)\,l\,\Phi'(x^{-1})v', v)\,dx \\
&= \int_G \bigl(\Phi(x)(\Phi'(x^{-1})v', u')u, v\bigr)\,dx = \int_G (\Phi(x)u, v)(\Phi'(x^{-1})v', u')\,dx,
\end{aligned}
\]

and (a) results since $(\Phi'(x^{-1})v', u') = \overline{(\Phi'(x)u', v')}$.

(b) We proceed in the same way, starting from $l : V \to V$, and obtain $L = \lambda I$
from Corollary 6.27. Taking the trace of both sides, we find that

\[
\lambda \dim V = \operatorname{Tr} L = \operatorname{Tr} l,
\]

so that $\lambda = (\operatorname{Tr} l)/\dim V$. Thus

\[
(Lv_2, v_1) = \frac{\operatorname{Tr} l}{\dim V}\,\overline{(v_1, v_2)}.
\]

Choose $l(w) = (w, u_2)u_1$, so that $\operatorname{Tr} l = (u_1, u_2)$. Then

\[
\begin{aligned}
\frac{(u_1, u_2)\,\overline{(v_1, v_2)}}{\dim V} &= \frac{\operatorname{Tr} l}{\dim V}\,\overline{(v_1, v_2)} = (Lv_2, v_1) = \int_G (\Phi(x)\,l\,\Phi(x^{-1})v_2, v_1)\,dx \\
&= \int_G \bigl(\Phi(x)(\Phi(x^{-1})v_2, u_2)u_1, v_1\bigr)\,dx = \int_G (\Phi(x)u_1, v_1)(\Phi(x^{-1})v_2, u_2)\,dx,
\end{aligned}
\]

and (b) results since $(\Phi(x^{-1})v_2, u_2) = \overline{(\Phi(x)u_2, v_2)}$.
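
The key step in part (b), that averaging $\Phi(x)\,l\,\Phi(x^{-1})$ over the group produces the scalar $(\operatorname{Tr} l)/\dim V$, can be tested on the 2-dimensional representation of Example 2c. The sketch below is illustrative only: the test matrix l is an arbitrary hypothetical choice, the integral is an average over the six group elements, and $\Phi(x^{-1})$ is computed as the transpose of $\Phi(x)$ because these particular matrices are real orthogonal.

```python
import math

R3 = math.sqrt(3.0)

def matmul(a, b):
    """Product of two 2-by-2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def transpose(a):
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))

I2 = ((1.0, 0.0), (0.0, 1.0))
S = ((-0.5, R3 / 2), (R3 / 2, 0.5))    # Phi((1 2)) from Example 2c
T = ((1.0, 0.0), (0.0, -1.0))          # Phi((2 3)) from Example 2c
group = [I2, S, T, matmul(S, T), matmul(T, S), matmul(S, matmul(T, S))]

l = ((1.0, 2.0), (3.0, 4.0))           # arbitrary test matrix (hypothetical data)

# L = average of Phi(x) l Phi(x^{-1}); here Phi(x^{-1}) = Phi(x)^T.
conjugates = [matmul(x, matmul(l, transpose(x))) for x in group]
L = tuple(tuple(sum(c[i][j] for c in conjugates) / len(conjugates) for j in range(2))
          for i in range(2))

trace_l = l[0][0] + l[1][1]
scalar = trace_l / 2                   # (Tr l) / dim V with dim V = 2
is_scalar = all(abs(L[i][j] - (scalar if i == j else 0.0)) < 1e-12
                for i in range(2) for j in range(2))
print(L, is_scalar)
```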


We can interpret Corollary 6.29 as follows. Let $\{\Phi^{(\alpha)}\}$ be a maximal set
of mutually inequivalent finite-dimensional irreducible unitary representations
of the compact group G. For each $\Phi^{(\alpha)}$, choose an orthonormal basis for the
underlying vector space, and let $\Phi^{(\alpha)}_{ij}(x)$ be the matrix of $\Phi^{(\alpha)}(x)$ in this basis.
Then the functions $\{\Phi^{(\alpha)}_{ij}(x)\}_{i,j,\alpha}$ form an orthogonal set in the space $L^2(G)$ of
square integrable functions on G. In fact, if $d^{(\alpha)}$ denotes the degree of $\Phi^{(\alpha)}$
(i.e., the dimension of the underlying vector space), then $\{(d^{(\alpha)})^{1/2}\Phi^{(\alpha)}_{ij}(x)\}_{i,j,\alpha}$
is an orthonormal set in $L^2(G)$. The Peter-Weyl Theorem in the next section will
generalize Parseval's Theorem in the subject of Fourier series by showing that
this orthonormal set is an orthonormal basis.

We can use Schur orthogonality to get a qualitative idea of the decomposition
into irreducible representations in Corollary 6.25 when $\Phi$ is a given
finite-dimensional representation of the compact group G. By Proposition 6.24 there
is no loss of generality in assuming that $\Phi$ is unitary. If $\Phi$ is a unitary
finite-dimensional representation of G, a matrix coefficient of $\Phi$ is any function on G
of the form $(\Phi(x)u, v)$. The character or group character of $\Phi$ is the function

\[
\chi_\Phi(x) = \operatorname{Tr} \Phi(x) = \sum_i (\Phi(x)u_i, u_i),
\]

where $\{u_i\}$ is an orthonormal basis. This function depends only on the equivalence
class of $\Phi$ and satisfies

\[
\chi_\Phi(gxg^{-1}) = \chi_\Phi(x) \qquad \text{for all } g, x \in G.
\]

If $\Phi$ is the direct sum of representations $\Phi_1, \dots, \Phi_n$, then

\[
\chi_\Phi = \chi_{\Phi_1} + \cdots + \chi_{\Phi_n}.
\]

Any multiplicative character is the group character of the corresponding
1-dimensional representation.
EXAMPLE 4, continued. Characters for SU(2). Let $\Phi_n$ be the representation of
SU(2) on the homogeneous holomorphic polynomials of degree n in $z_1$ and $z_2$. A
basis for V consists of the monomials $z_1^k z_2^{n-k}$ for $0 \le k \le n$, and we easily check
that $\Phi_n$ of the diagonal matrix $t_\theta = \operatorname{diag}(e^{i\theta}, e^{-i\theta})$ has $z_1^k z_2^{n-k}$ as an eigenvector
with eigenvalue $e^{i(n-2k)\theta}$. Therefore

\[
\chi_{\Phi_n}(t_\theta) = \operatorname{Tr} \Phi_n(t_\theta) = e^{in\theta} + e^{i(n-2)\theta} + \cdots + e^{-in\theta}.
\]

Every element of SU(2) is conjugate to some matrix $t_\theta$, and therefore this formula
determines $\chi_{\Phi_n}$ on all of SU(2).
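
The character sum for SU(2) is a finite geometric series, and it can be compared numerically with a closed form. The sketch below is an illustration only; the closed form $\sin((n+1)\theta)/\sin\theta$ is the standard summation of this geometric series and is an added remark, not a formula stated in the text. The function names and the sample angles are hypothetical choices.

```python
import cmath
import math

def character(n, theta):
    """Sum of the eigenvalues e^{i(n-2k)theta}, k = 0, ..., n, of Phi_n(t_theta)."""
    return sum(cmath.exp(1j * (n - 2 * k) * theta) for k in range(n + 1))

def closed_form(n, theta):
    """Geometric-series closed form, valid when theta is not a multiple of pi."""
    return math.sin((n + 1) * theta) / math.sin(theta)

thetas = [0.3, 1.1, 2.0, 2.9]          # sample angles, none a multiple of pi
agrees = all(abs(character(n, t) - closed_form(n, t)) < 1e-9
             for n in range(7) for t in thetas)

# At theta = 0 the matrix t_theta is the identity, and the character is dim V = n + 1.
dimension_at_identity = all(abs(character(n, 0.0) - (n + 1)) < 1e-12 for n in range(7))
print(agrees, dimension_at_identity)
```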


Corollary 6.30. If G is a compact group, then the character $\chi$ of an irreducible
finite-dimensional representation has $L^2$ norm satisfying $\|\chi\|_2 = 1$. If $\chi$ and $\chi'$
are characters of inequivalent irreducible finite-dimensional representations, then
$\int_G \chi(x)\overline{\chi'(x)}\,dx = 0$.


PROOF. These formulas are immediate from Corollary 6.29 since characters 
are sums of matrix coefficients. 


Now let $\Phi$ be a given finite-dimensional representation of G, and write $\Phi$ as the
direct sum of irreducible representations $\Phi_1, \dots, \Phi_n$. If $\tau$ is an irreducible
finite-dimensional representation of G, then the sum formula for characters, together
with Corollary 6.30, shows that $\int_G \chi_\Phi(x)\overline{\chi_\tau(x)}\,dx$ is the number of summands
$\Phi_i$ equivalent to $\tau$. Evidently this integer is independent of the decomposition of
$\Phi$ into irreducible representations. It is called the multiplicity of $\tau$ in $\Phi$.
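
As a worked instance of the multiplicity formula, one can decompose the 3-dimensional permutation representation of Example 2b. The sketch below is illustrative and rests on stated assumptions: the character of a permutation representation is the number of fixed points (the trace of a permutation matrix), the character of the 2-dimensional representation is read off from the traces of the matrices tabulated in Example 2c, and all characters involved are real, so the complex conjugate in the formula is omitted. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# The symmetric group on {0, 1, 2}; p[i] is the image of i.
G = list(permutations(range(3)))

def fixed_points(p):
    return sum(1 for i in range(3) if p[i] == i)

def sign(p):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def chi_perm(p):
    """Character of Example 2b: trace of the permutation matrix."""
    return fixed_points(p)

def chi_trivial(p):
    return 1

def chi_two_dim(p):
    """Traces from the table of Example 2c: identity 2, transpositions 0, 3-cycles -1."""
    return {3: 2, 1: 0, 0: -1}[fixed_points(p)]

def inner(chi, psi):
    """Integral of chi * psi over G for normalized counting measure (real characters)."""
    return Fraction(sum(chi(p) * psi(p) for p in G), len(G))

multiplicities = {
    "trivial": inner(chi_perm, chi_trivial),
    "sign": inner(chi_perm, sign),
    "two_dim": inner(chi_perm, chi_two_dim),
}
print(multiplicities)   # multiplicities 1, 0, 1
```

The result says that the permutation representation is the direct sum of the trivial representation, carried by $\mathbb{C}(e_1 + e_2 + e_3)$, and one copy of the 2-dimensional representation; the dimensions 1 + 2 = 3 are consistent with this.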


7. Peter-Weyl Theorem 


The goal of this section is to extend Parseval's Theorem for the circle group
$S^1 = \mathbb{R}/2\pi\mathbb{Z}$ to a theorem valid for all compact groups. The extension is the
Peter-Weyl Theorem. We continue with the notation of the previous section,
letting G be the group, dx be a two-sided Haar measure normalized to have
total measure one, and, in cases when G is not separable, working with Baire
measurable functions rather than Borel measurable functions.

For $S^1$, we observed in Corollary 6.28 that the irreducible finite-dimensional
representations are 1-dimensional, hence are given by multiplicative characters.
The exponential functions $x \mapsto e^{inx}$ are examples of multiplicative characters,
and it is an exercise in real-variable theory, not hard, to prove that there are no
other examples. The matrix coefficients of the 1-dimensional representations
are just the same exponential functions $x \mapsto e^{inx}$. The Peter-Weyl Theorem
specialized to this group says that the vector space of finite linear combinations
of exponential functions is dense in $L^2(S^1)$; the statement is a version of Fejér's
Theorem for $L^2$ but without the precise detail of Fejér's Theorem. In view of
the known orthogonality of the exponential functions, an equivalent formulation
of the result for $S^1$ is that $\{e^{inx}\}_{n=-\infty}^{\infty}$ is a maximal orthonormal set in $L^2(S^1)$.
By Hilbert-space theory, $\{e^{inx}\}_{n=-\infty}^{\infty}$ is an orthonormal basis of $L^2(S^1)$. For
general compact G, the Peter-Weyl Theorem asserts that the vector space of finite
linear combinations of all matrix coefficients of all irreducible finite-dimensional
representations is again dense in $L^2(G)$. The new ingredient is that we must allow
irreducible representations of dimension > 1; indeed, examination of the group
$\mathfrak{S}_3$ shows that the 1-dimensional representations are not enough. An equivalent
formulation in terms of orthonormal bases will be given in Corollary 6.32 below
and will use Schur orthogonality (Corollary 6.29).


Theorem 6.31 (Peter-Weyl Theorem). If G is a compact group, then the
linear span of all matrix coefficients for all finite-dimensional irreducible unitary
representations of G is dense in $L^2(G)$.


PROOF. If $h(x) = (\Phi(x)u, v)$ is such a matrix coefficient, then the following
functions of x are also matrix coefficients for the same representation:

\[
\begin{aligned}
\overline{h(x^{-1})} &= (\Phi(x)v, u), \\
h(gx) &= (\Phi(x)u, \Phi(g^{-1})v), \\
h(xg) &= (\Phi(x)\Phi(g)u, v).
\end{aligned}
\]

Then the closure U in $L^2(G)$ of the linear span of all matrix coefficients of
all finite-dimensional irreducible unitary representations is stable under the map
$h(x) \mapsto \overline{h(x^{-1})}$ and under left and right translation. Arguing by contradiction,
suppose that $U \ne L^2(G)$. Then $U^\perp \ne 0$, and $U^\perp$ is closed under $h(x) \mapsto \overline{h(x^{-1})}$
and under left and right translation.

We first prove that there is a nonzero continuous function in $U^\perp$. Thus let
$H \ne 0$ be in $U^\perp$. For each open neighborhood N of 1 that is a $G_\delta$, we define

\[
f_N(x) = \frac{1}{|N|}(I_N * H)(x) = \frac{1}{|N|}\int_G I_N(y)H(y^{-1}x)\,dy,
\]

where $I_N$ is the indicator function of N and $|N|$ is the Haar measure of N.
Since $I_N$ and H are in $L^2(G)$, Proposition 6.20 shows that $f_N$ is continuous. As
N shrinks to {1}, the functions $f_N$ tend to H in $L^2$ by the usual
approximate-identity argument; hence some $f_N$ is not 0. Finally each linear combination of
left translates of H is in $U^\perp$, and $f_N$ is therefore in $U^\perp$ by Proposition 6.22.

Thus $U^\perp$ contains a nonzero continuous function. Using translations and
scalar multiplications, we can adjust this function so that it becomes a continuous
function $F_1$ in $U^\perp$ with $F_1(1)$ real and nonzero. Set

$$F_2(x) = \int_G F_1(yxy^{-1})\,dy.$$

Then $F_2$ is continuous, $F_2(gxg^{-1}) = F_2(x)$ for all $g \in G$, and $F_2(1) = F_1(1)$
is real and nonzero. To see that $F_2$ is in $U^\perp$, we argue as follows: Corollary
6.7 shows that the map $(g, g') \mapsto F_1(g(\cdot)g')$ is continuous from $G \times G$ into
$C(G)$, and hence the restriction $y \mapsto F_1(y(\cdot)y^{-1})$ is continuous from $G$ into
$C(G)$. The domain is compact, and therefore the image is compact, hence totally
bounded. Consequently if $\epsilon > 0$ is given, then there exist $y_1, \dots, y_n$ such that
each $y \in G$ has some $y_j$ such that $\|F_1(y(\cdot)y^{-1}) - F_1(y_j(\cdot)y_j^{-1})\|_{\sup} < \epsilon$. Let
$E_j$ be the subset of $y$'s such that $j$ is the first index for which this happens, and
let $|E_j|$ be its Haar measure. Then

$$\begin{aligned}
\Big|\int_G F_1(yxy^{-1})\,dy - \sum_j |E_j|\,F_1(y_j x y_j^{-1})\Big|
&= \Big|\sum_j \int_{E_j} \big[F_1(yxy^{-1}) - F_1(y_j x y_j^{-1})\big]\,dy\Big| \\
&\le \sum_j \int_{E_j} \big|F_1(yxy^{-1}) - F_1(y_j x y_j^{-1})\big|\,dy \le \sum_j \epsilon \int_{E_j} dy = \epsilon,
\end{aligned}$$

and we see that $F_2$ is the uniform limit of finite linear combinations of group
conjugates of $F_1$. Each such finite linear combination is in $U^\perp$, and hence $F_2$ is
in $U^\perp$.

Finally put

$$F(x) = F_2(x) + \overline{F_2(x^{-1})}.$$

Then $F$ is continuous and is in $U^\perp$, $F(gxg^{-1}) = F(x)$ for all $g \in G$, $F(1) =
2F_2(1)$ is real and nonzero, and $F(x) = \overline{F(x^{-1})}$. In particular, $F$ is not the 0
function in $L^2(G)$.

Form the continuous function $K(x, y) = F(x^{-1}y)$ and the integral operator

$$Tf(x) = \int_G K(x, y)f(y)\,dy = \int_G F(x^{-1}y)f(y)\,dy \qquad \text{for } f \in L^2(G).$$

Then $K(x, y) = \overline{K(y, x)}$ and $\int_{G \times G} |K(x, y)|^2\,dx\,dy < \infty$. Also, $T$ is not 0
since $F \ne 0$. The Hilbert-Schmidt Theorem (Theorem 2.4) applies to $T$ as a
linear operator from $L^2(G)$ to itself, and there must be a real nonzero eigenvalue
$\lambda$, the corresponding eigenspace $V_\lambda \subseteq L^2(G)$ being finite dimensional.

Let us see that the subspace $V_\lambda$ is invariant under left translation by $g$, which
we write as $(L(g)f)(x) = f(g^{-1}x)$. In fact, $f$ in $V_\lambda$ implies

$$\begin{aligned}
TL(g)f(x) &= \int_G F(x^{-1}y)f(g^{-1}y)\,dy = \int_G F(x^{-1}gy)f(y)\,dy \\
&= Tf(g^{-1}x) = \lambda f(g^{-1}x) = \lambda L(g)f(x).
\end{aligned}$$

By Proposition 6.19, $g \mapsto L(g)f$ is continuous from $G$ into $L^2(G)$, and therefore
$L$ is a representation of $G$ in the finite-dimensional space $V_\lambda$. By dimensionality,
$V_\lambda$ contains an irreducible invariant subspace $W_\lambda \ne 0$.

Let $(f_1, \dots, f_n)$ be an ordered orthonormal basis of $W_\lambda$. The matrix coefficients
for $W_\lambda$ are the functions

$$h_{ij}(x) = (L(x)f_j, f_i) = \int_G f_j(x^{-1}y)\overline{f_i(y)}\,dy$$

and by definition are in $U$. Since $F$ is in $U^\perp$, we have

$$\begin{aligned}
0 = \int_G F(x)\overline{h_{ii}(x)}\,dx &= \int_G \int_G F(x)\overline{f_i(x^{-1}y)}f_i(y)\,dy\,dx \\
&= \int_G \int_G F(x)\overline{f_i(x^{-1}y)}f_i(y)\,dx\,dy \\
&= \int_G \int_G F(yx^{-1})\overline{f_i(x)}f_i(y)\,dx\,dy \\
&= \int_G \Big[\int_G F(x^{-1}y)f_i(y)\,dy\Big]\overline{f_i(x)}\,dx \qquad \text{since } F(gxg^{-1}) = F(x) \\
&= \int_G [Tf_i(x)]\overline{f_i(x)}\,dx = \lambda \int_G |f_i(x)|^2\,dx
\end{aligned}$$

for all $i$, in contradiction to the fact that $W_\lambda \ne 0$. We conclude that $U^\perp = 0$ and
therefore that $U = L^2(G)$.


Corollary 6.32. If $\{\Phi^{(\alpha)}\}$ is a maximal set of mutually inequivalent finite-dimensional
irreducible unitary representations of a compact group $G$ and if
$\{(d^{(\alpha)})^{1/2}\Phi^{(\alpha)}_{ij}(x)\}_{i,j,\alpha}$ is a corresponding orthonormal set of matrix coefficients,
then $\{(d^{(\alpha)})^{1/2}\Phi^{(\alpha)}_{ij}(x)\}_{i,j,\alpha}$ is an orthonormal basis of $L^2(G)$. Consequently any
$f$ in $L^2(G)$ has the property that

$$\|f\|_2^2 = \sum_\alpha \sum_{i,j} d^{(\alpha)}\,|(f, \Phi^{(\alpha)}_{ij})|^2,$$

where $(\cdot, \cdot)$ is the $L^2$ inner product.


REMARK. The displayed formula, which extends Parseval's Theorem from $S^1$
to the compact group $G$, is called the Plancherel formula for $G$.


PROOF. The linear span of the orthonormal set in question equals the linear 
span of all matrix coefficients for all finite-dimensional irreducible unitary rep- 
resentations of G. Theorem 6.31 implies that the orthonormal set is maximal. 
Hilbert-space theory then shows that the orthonormal set is an orthonormal basis 
and that Parseval’s equality holds, and the latter fact yields the corollary. 
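
Since every finite group is compact, the Plancherel formula of Corollary 6.32 can be checked by direct summation. The following is a minimal sketch, assuming the group $\mathfrak{S}_3$ with normalized counting measure as Haar measure; the function names (`standard`, `plancherel_sides`, and so on) and the test function `f` are hypothetical choices for illustration. The three irreducible representations used are the trivial one, the sign, and the 2-dimensional action on the plane $x_0 + x_1 + x_2 = 0$.

```python
from itertools import permutations
from math import sqrt

# S3 as permutations of {0, 1, 2}; g[i] is the image of i under g.
G = list(permutations(range(3)))

def sign(g):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1

# Orthonormal basis of the plane x0 + x1 + x2 = 0, on which S3 acts irreducibly.
B = [(1 / sqrt(2), -1 / sqrt(2), 0.0),
     (1 / sqrt(6), 1 / sqrt(6), -2 / sqrt(6))]

def standard(g):
    """2-by-2 orthogonal matrix of g in the basis B (g sends e_c to e_{g[c]})."""
    return [[sum(B[i][g[c]] * B[j][c] for c in range(3)) for j in range(2)]
            for i in range(2)]

# One representative from each equivalence class of irreducible representations.
IRREPS = [lambda g: [[1.0]], lambda g: [[float(sign(g))]], standard]

def inner(f, h):
    """L^2 inner product for Haar measure of total mass one."""
    return sum(f[g] * h[g].conjugate() for g in G) / len(G)

def plancherel_sides(f):
    """Return ||f||_2^2 and the sum of d |(f, Phi_ij)|^2 over all alpha, i, j."""
    lhs = inner(f, f).real
    rhs = 0.0
    for rep in IRREPS:
        d = len(rep(G[0]))
        for i in range(d):
            for j in range(d):
                coeff = {g: complex(rep(g)[i][j]) for g in G}
                rhs += d * abs(inner(f, coeff)) ** 2
    return lhs, rhs

# An arbitrary complex-valued test function on S3.
f = {g: complex(k + 1, (-1) ** k * k) for k, g in enumerate(G)}
lhs, rhs = plancherel_sides(f)
print(lhs, rhs)  # the two values should agree up to rounding
```

The count $1^2 + 1^2 + 2^2 = 6$ of matrix coefficients matches the dimension of $L^2(\mathfrak{S}_3)$, which is the finite-group shadow of the orthonormal-basis statement.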


As is implicit in the proof of Corollary 6.32, the partial sums in the expansion of
$f$ in terms of the orthonormal set of normalized matrix coefficients are converging
to $f$ in $L^2(G)$. The next result along these lines gives an analog of Fejér's Theorem
for Fourier series of continuous functions. Taking a cue from the theory of Fourier
series, let us refer to any finite linear combination of the functions $\Phi^{(\alpha)}_{ij}(x)$ in the
above corollary as a trigonometric polynomial.


Corollary 6.33 (Approximation Theorem). There exists a net $T^{(\beta)}$ of uniformly
bounded linear operators from $C(G)$ into itself such that for every $f$ in $C(G)$,
$T^{(\beta)}f$ is a trigonometric polynomial for each $\beta$ and $\lim_\beta T^{(\beta)}f = f$ uniformly
on $G$.


PROOF. The directed set will consist of pairs $\beta = (N, \epsilon)$, where $N$ is an open
$G_\delta$ containing the identity of $G$ and where $1 > \epsilon > 0$, and the partial ordering
is that $(N, \epsilon) \le (N', \epsilon')$ if $N \supseteq N'$ and $\epsilon \ge \epsilon'$. If $\beta = (N, \epsilon)$ is given, let
$|N|$ be the Haar measure of $N$, and let $\psi_N = |N|^{-1}I_N$ be the positive multiple
of the indicator function of $N$ that makes $\psi_N$ have $\|\psi_N\|_1 = 1$. Since $\psi_N$ is
in $L^2(G)$, Theorem 6.31 shows that we can find a trigonometric polynomial $\varphi_\beta$
such that $\|\psi_N - \varphi_\beta\|_2 < \epsilon$. The operator $T^{(\beta)}$ will be given by convolution:
$T^{(\beta)}f = \varphi_\beta * f$.

Since $\|\psi_N - \varphi_\beta\|_1 \le \|\psi_N - \varphi_\beta\|_2 < \epsilon < 1$, we have $\|\varphi_\beta\|_1 < 2$. Therefore
the operator norm of $T^{(\beta)}$ on $C(G)$ is $\le 2$.

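To see that $T^{(\beta)}f$ converges uniformly to $f$, we use a variant of a familiar
argument with approximate identities. We write

$$\|T^{(\beta)}f - f\|_{\sup} \le \|(\varphi_\beta - \psi_N) * f\|_{\sup} + \|\psi_N * f - f\|_{\sup}.$$

The first term on the right is $\le \|\varphi_\beta - \psi_N\|_1\|f\|_{\sup} \le \|\varphi_\beta - \psi_N\|_2\|f\|_{\sup} \le
\epsilon\|f\|_{\sup}$. For the second term we have

$$\begin{aligned}
|\psi_N * f(x) - f(x)| &= \Big|\int_G \psi_N(y)[f(y^{-1}x) - f(x)]\,dy\Big| \\
&\le \int_G \psi_N(y)\,|f(y^{-1}x) - f(x)|\,dy \\
&= |N|^{-1}\int_N |f(y^{-1}x) - f(x)|\,dy \\
&\le \sup_{y \in N} |f(y^{-1}x) - f(x)|,
\end{aligned}$$

and Proposition 6.6 shows that this expression tends to 0 as $N$ shrinks to $\{1\}$.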

Finally we show that $T^{(\beta)}f$ is a trigonometric polynomial, i.e., that there are
only finitely many irreducible representations $\Phi$, up to equivalence, such that the
$L^2$ inner product $(T^{(\beta)}f, \Phi_{ij})$ can be nonzero. This inner product is equal to

$$\begin{aligned}
\int_G (\varphi_\beta * f)(x)\overline{\Phi_{ij}(x)}\,dx &= \iint_{G \times G} \varphi_\beta(xy^{-1})f(y)\overline{\Phi_{ij}(x)}\,dx\,dy \\
&= \iint_{G \times G} \varphi_\beta(x)f(y)\overline{\Phi_{ij}(xy)}\,dx\,dy \\
&= \sum_k \iint_{G \times G} \varphi_\beta(x)f(y)\overline{\Phi_{ik}(x)}\,\overline{\Phi_{kj}(y)}\,dx\,dy \\
&= \sum_k \int_G f(y)\overline{\Phi_{kj}(y)}\Big[\int_G \varphi_\beta(x)\overline{\Phi_{ik}(x)}\,dx\Big]\,dy,
\end{aligned}$$

and Schur orthogonality (Corollary 6.29) shows that the expression in brackets
is 0 unless $\Phi$ is equivalent to one of the irreducible representations whose matrix
coefficients contribute to $\varphi_\beta$.


8. Fourier Analysis Using Compact Groups


In the discussion of the representation theory of compact groups in the previous 
two sections, all the representations were finite dimensional. A number of appli- 
cations of compact groups to analysis, however, involve naturally arising infinite- 
dimensional representations, and a theory of such representations is needed. We 
address this problem now, and we illustrate how the theory of infinite-dimensional 
representations can be used to simplify analysis problems having a compact group 
of symmetries. 

We continue with the notation of the previous two sections, letting G be the 
compact group and dx be a two-sided Haar measure normalized to have total 
measure one. In cases in which G is not separable, we work with Baire measurable 
functions rather than Borel measurable functions. 

Recall from Section II.4 and Proposition 2.6 that if $V$ is a complex Hilbert
space with inner product $(\cdot, \cdot)$ and norm $\|\cdot\|$, then a unitary operator $U$ on $V$ is
a bounded linear operator from $V$ into itself such that $U^*$ is a two-sided inverse
of $U$, or equivalently is a linear operator from $V$ to itself that preserves norms
and is onto $V$, or equivalently is a linear operator from $V$ to itself that preserves
inner products and is onto $V$.

From the definition the unitary operators on V form a group. Unlike what 
happens with the N-by-N unitary group U(N), this group is not compact if V 
is infinite-dimensional. A unitary representation of G on the complex Hilbert 
space V is a homomorphism of G into the group of unitary operators on V such 
that a certain continuity property holds. Continuity is a more subtle matter in the 
present context than it was in the finite-dimensional case because not all possible 
definitions of continuity are equivalent here. The continuity property we choose
is that the group action $G \times V \to V$, given by $g \times v \mapsto \Phi(g)v$, is continuous.
When $\Phi$ is unitary, this property is equivalent to strong continuity, namely that
$g \mapsto \Phi(g)v$ is continuous for every $v$ in $V$.

Let us see this equivalence. Strong continuity results from fixing the $V$ variable
in the definition of continuity of the group action, and therefore continuity of
the group action implies strong continuity. In the reverse direction the triangle
inequality and the equality $\|\Phi(g)\| = 1$ give

$$\begin{aligned}
\|\Phi(g)v - \Phi(g_0)v_0\| &\le \|\Phi(g)(v - v_0)\| + \|\Phi(g)v_0 - \Phi(g_0)v_0\| \\
&= \|v - v_0\| + \|\Phi(g)v_0 - \Phi(g_0)v_0\|,
\end{aligned}$$

and it follows that strong continuity implies continuity of the group action.

With this definition of continuity in place, an example of a unitary representation
is the left-regular representation of $G$ on the complex Hilbert space
$L^2(G)$, given by $(l(g)f)(x) = f(g^{-1}x)$. Strong continuity is satisfied according
to Proposition 6.19. The right-regular representation of $G$ on $L^2(G)$, given by
$(r(g)f)(x) = f(xg)$, also satisfies this continuity property.

In working with a unitary representation $\Phi$ of $G$ on $V$, it is helpful to define
$\Phi(f)$ for $f$ in $L^1(G)$ as a smeared-out version of the various $\Phi(x)$'s for $x$ in
$G$. Formally $\Phi(f)$ is to be $\int_G f(x)\Phi(x)\,dx$. But to avoid integrating functions
whose values are in an infinite-dimensional space, we define $\Phi(f)$ as follows:
The function $\int_G f(x)(\Phi(x)v, v')\,dx$ of $v$ and $v'$ is linear in $v$, conjugate linear
in $v'$, and bounded in the sense that $\big|\int_G f(x)(\Phi(x)v, v')\,dx\big| \le \|f\|_1\|v\|\|v'\|$.
Hilbert-space theory shows as a consequence¹¹ that there exists a unique linear
operator $\Phi(f)$ such that

$$(\Phi(f)v, v') = \int_G f(x)(\Phi(x)v, v')\,dx \qquad \text{for all } v \text{ and } v' \text{ in } V$$

and that this operator is bounded with

$$\|\Phi(f)\| \le \|f\|_1.$$

From the existence and uniqueness of $\Phi(f)$, it follows that $\Phi(f)$ depends linearly
on $f$.

Let us digress for a moment to consider $\Phi(f)$ if $\Phi$ happens to be finite-dimensional.
If $\{u_i\}$ is an ordered orthonormal basis of the underlying finite-dimensional
vector space, then the matrix corresponding to $\Phi(f)$ in this basis
has $(i, j)$th entry $(\Phi(f)u_j, u_i) = \int_G f(x)(\Phi(x)u_j, u_i)\,dx$. The expression

$$\sum_{i,j} |(\Phi(f)u_j, u_i)|^2 = \sum_{i,j} \Big|\int_G f(x)(\Phi(x)u_j, u_i)\,dx\Big|^2$$

is, on the one hand, the kind of term that appears in the Plancherel formula in
Corollary 6.32 and, on the other hand, is what in Section II.5 was called the
Hilbert-Schmidt norm squared $\|\Phi(f)\|_{\mathrm{HS}}^2$ of $\Phi(f)$. It has to be independent of
the basis here in order to yield consistent formulas as we change orthonormal
bases, and that independence of basis was proved in Section II.5. Using the
Hilbert-Schmidt norm, we can rewrite the Plancherel formula in Corollary 6.32
as

$$\|f\|_2^2 = \sum_\alpha d^{(\alpha)}\,\|\Phi^{(\alpha)}(f)\|_{\mathrm{HS}}^2.$$

Unlike the formula in Corollary 6.32, this formula is canonical, not depending on
any choice of bases.


¹¹ See the remarks near the beginning of Section XII.3 of Basic.

Returning from our digression, let us again allow $\Phi$ to be infinite-dimensional.


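The mapping $f \mapsto \Phi(f)$ for $f$ in $L^1(G)$ has two other properties of note. The
first is that

$$\Phi(f)^* = \Phi(f^*),$$

where $f^*(x) = \overline{f(x^{-1})}$. To prove this formula, we simply write everything out:

$$\begin{aligned}
(\Phi(f)^*v, v') &= (v, \Phi(f)v') = \int_G (v, f(x)\Phi(x)v')\,dx \\
&= \int_G \overline{f(x)}\,(\Phi(x^{-1})v, v')\,dx = \int_G \overline{f(x^{-1})}\,(\Phi(x)v, v')\,dx \\
&= \int_G f^*(x)(\Phi(x)v, v')\,dx = (\Phi(f^*)v, v').
\end{aligned}$$

The other property concerns convolution and is that

$$\Phi(f * h) = \Phi(f)\Phi(h).$$

The formal computation to prove this is

$$\begin{aligned}
\Phi(f * h) &= \int_G \int_G f(xy^{-1})h(y)\Phi(x)\,dy\,dx = \int_G \int_G f(xy^{-1})h(y)\Phi(x)\,dx\,dy \\
&= \int_G \int_G f(x)h(y)\Phi(xy)\,dx\,dy = \int_G \int_G f(x)h(y)\Phi(x)\Phi(y)\,dx\,dy \\
&= \Phi(f)\Phi(h).
\end{aligned}$$

To make this computation rigorous, we put the appropriate inner products in place
and use Fubini's Theorem to justify the interchange of order of integration:

$$\begin{aligned}
(\Phi(f * h)v, v')
&= \int_G \int_G f(xy^{-1})h(y)(\Phi(x)v, v')\,dy\,dx = \int_G \int_G f(xy^{-1})h(y)(\Phi(x)v, v')\,dx\,dy \\
&= \int_G \int_G f(x)h(y)(\Phi(xy)v, v')\,dx\,dy = \int_G \int_G f(x)h(y)(\Phi(x)\Phi(y)v, v')\,dx\,dy \\
&= \int_G \int_G f(x)h(y)(\Phi(y)v, \Phi(x)^*v')\,dx\,dy \\
&= \int_G \int_G f(x)h(y)(\Phi(y)v, \Phi(x)^*v')\,dy\,dx = \int_G f(x)(\Phi(h)v, \Phi(x)^*v')\,dx \\
&= \int_G f(x)(\Phi(x)\Phi(h)v, v')\,dx = (\Phi(f)\Phi(h)v, v').
\end{aligned}$$

This kind of computation translating a formal argument about $\Phi(f)$ into a rigorous
argument is one that we shall normally omit from now on.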
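
For a finite group the integrals defining $\Phi(f)$ are finite averages, so the identities $\Phi(f)^* = \Phi(f^*)$ and $\Phi(f * h) = \Phi(f)\Phi(h)$ can be tested directly. The sketch below is an illustration under assumptions made here: the group is $\mathfrak{S}_3$, Haar measure is normalized counting measure, $\Phi$ is the 2-dimensional representation on the plane $x_0 + x_1 + x_2 = 0$, and all function names (`smear`, `convolve`, `star`, ...) are hypothetical.

```python
from itertools import permutations
from math import sqrt

G = list(permutations(range(3)))  # S3; g[i] is the image of i under g

def mul(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in range(3))

def inv(g):
    """Inverse permutation."""
    out = [0, 0, 0]
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

# Orthonormal basis of the plane x0 + x1 + x2 = 0.
B = [(1 / sqrt(2), -1 / sqrt(2), 0.0),
     (1 / sqrt(6), 1 / sqrt(6), -2 / sqrt(6))]

def Phi(g):
    """Matrix of the permutation action of g on the plane, in the basis B."""
    return [[sum(B[i][g[c]] * B[j][c] for c in range(3)) for j in range(2)]
            for i in range(2)]

def smear(f):
    """Phi(f): the average of f(x) Phi(x) over G (Haar measure of mass one)."""
    return [[sum(f[x] * Phi(x)[i][j] for x in G) / len(G) for j in range(2)]
            for i in range(2)]

def convolve(f, h):
    """(f * h)(x) = average over y of f(x y^{-1}) h(y)."""
    return {x: sum(f[mul(x, inv(y))] * h[y] for y in G) / len(G) for x in G}

def star(f):
    """f^*(x) = complex conjugate of f(x^{-1})."""
    return {x: f[inv(x)].conjugate() for x in G}

def matmul(A, C):
    return [[sum(A[i][k] * C[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def distance(A, C):
    """Largest absolute difference between corresponding entries."""
    return max(abs(A[i][j] - C[i][j]) for i in range(2) for j in range(2))

# Two arbitrary complex-valued functions on S3.
f = {g: complex(k + 1, -k) for k, g in enumerate(G)}
h = {g: complex(k * k, 2 - k) for k, g in enumerate(G)}

conv_error = distance(smear(convolve(f, h)), matmul(smear(f), smear(h)))
star_error = distance(adjoint(smear(f)), smear(star(f)))
print(conv_error, star_error)  # both should be at rounding level
```

The convolution convention in `convolve` is the one used in the formal computation above, with $f(xy^{-1})h(y)$ under the integral.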

An important instance of a convolution $f * h$ is the case that $f$ and $h$ are
characters of irreducible finite-dimensional representations. The formula in this
case is

$$\chi_\tau * \chi_{\tau'} = \begin{cases} d_\tau^{-1}\chi_\tau & \text{if } \tau \cong \tau' \text{ and } d_\tau \text{ is the degree of } \tau, \\ 0 & \text{if } \tau \text{ and } \tau' \text{ are inequivalent.} \end{cases}$$

This follows by expanding the characters in terms of matrix coefficients and
computing the integrals using Schur orthogonality (Corollary 6.29).
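
The character formula can be confirmed by hand for a small group. The following sketch assumes $\mathfrak{S}_3$ with normalized counting measure and uses the standard facts that its irreducible characters are the trivial character, the sign, and the degree-2 character equal to the number of fixed points minus one; the dictionary `CHARACTERS` and the helper names are illustrative, not from the text.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S3; g[i] is the image of i under g

def mul(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in range(3))

def inv(g):
    """Inverse permutation."""
    out = [0, 0, 0]
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def sign(g):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1

# name -> (degree d_tau, character chi_tau)
CHARACTERS = {
    "trivial": (1, lambda g: 1.0),
    "sign": (1, lambda g: float(sign(g))),
    "standard": (2, lambda g: float(sum(1 for i in range(3) if g[i] == i) - 1)),
}

def convolve(f, h):
    """(f * h)(x) = average over y of f(x y^{-1}) h(y), for callables f and h."""
    return {x: sum(f(mul(x, inv(y))) * h(y) for y in G) / len(G) for x in G}

def max_error(name_a, name_b):
    """Largest deviation of chi_a * chi_b from the value the formula predicts."""
    d_a, chi_a = CHARACTERS[name_a]
    _, chi_b = CHARACTERS[name_b]
    conv = convolve(chi_a, chi_b)
    if name_a == name_b:
        return max(abs(conv[g] - chi_a(g) / d_a) for g in G)
    return max(abs(conv[g]) for g in G)

errors = {(a, b): max_error(a, b) for a in CHARACTERS for b in CHARACTERS}
print(max(errors.values()))  # should be at rounding level
```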

If $f \ge 0$ vanishes outside an open neighborhood $N$ of 1 that is a $G_\delta$ in $G$ and
if $\int_G f(x)\,dx = 1$, then $(\Phi(f)v - v, v') = \int_G f(x)(\Phi(x)v - v, v')\,dx$. When
$\|v'\| \le 1$, the Schwarz inequality therefore gives

$$|(\Phi(f)v - v, v')| \le \int_N f(x)\,\|\Phi(x)v - v\|\,\|v'\|\,dx \le \sup_{x \in N} \|\Phi(x)v - v\|.$$

Taking the supremum over $v'$ with $\|v'\| \le 1$ allows us to conclude that

$$\|\Phi(f)v - v\| \le \sup_{x \in N} \|\Phi(x)v - v\|.$$

We shall make use of this inequality shortly.

An invariant subspace for a unitary representation $\Phi$ on $V$ is, just as in the
finite-dimensional case, a vector subspace $U$ such that $\Phi(g)U \subseteq U$ for all $g \in G$.
This notion is useful mainly when $U$ is a closed subspace. In any event if $U$ is
invariant, so is the closed orthogonal complement $U^\perp$ since $u^\perp \in U^\perp$ and $u \in U$
imply that

$$(\Phi(g)u^\perp, u) = (u^\perp, \Phi(g)^*u) = (u^\perp, \Phi(g)^{-1}u) = (u^\perp, \Phi(g^{-1})u)$$

is in $(u^\perp, U) = 0$. If $V \ne 0$, the representation is irreducible if its only closed
invariant subspaces are 0 and $V$.

Two unitary representations of $G$, $\Phi$ on $V$ and $\Phi'$ on $V'$, are said to be
equivalent if there is a bounded linear $E : V \to V'$ with a bounded inverse
such that $\Phi'(g)E = E\Phi(g)$ for all $g \in G$.


Theorem 6.34. If $\Phi$ is a unitary representation of the compact group $G$ on
a complex Hilbert space $V$, then $V$ is the orthogonal sum of finite-dimensional
irreducible invariant subspaces.


REMARK. The new content of the theorem is for the case that V is infinite 
dimensional. The theorem says that if one takes the union of orthonormal bases 
for each of certain finite-dimensional irreducible invariant subspaces, then the 
result is an orthonormal basis of V. 


PROOF. By Zorn's Lemma, choose a maximal orthogonal set of finite-dimensional
irreducible invariant subspaces, and let $U$ be the closure of the sum.
Arguing by contradiction, suppose that $U$ is not all of $V$. Then $U^\perp$ is a nonzero
closed invariant subspace. Fix $v \ne 0$ in $U^\perp$. For each open neighborhood $N$ of 1
that is a $G_\delta$ in $G$, let $f_N$ be the indicator function of $N$ divided by the measure of
$N$. Then $f_N$ is an integrable function $\ge 0$ with integral 1. It is immediate from
the definition of $(\Phi(f_N)v, u)$ that this inner product is 0 for every $N$ and every
$u \in U$, i.e., that $\Phi(f_N)v$ is in $U^\perp$ for every $N$.
The inequality $\|\Phi(f_N)v - v\| \le \sup_{x \in N} \|\Phi(x)v - v\|$ and strong continuity of
$\Phi$ show that $\Phi(f_N)v$ tends to $v$ as $N$ shrinks to $\{1\}$. Hence some $\Phi(f_N)v$ is not
0. Fix such an $N$.

Choose by the Peter-Weyl Theorem (Theorem 6.31) a function $h$ in the linear
span of all matrix coefficients for all finite-dimensional irreducible unitary
representations such that $\|f_N - h\|_2 \le \tfrac12\|\Phi(f_N)v\| / \|v\|$. Then

$$\begin{aligned}
\|\Phi(f_N)v - \Phi(h)v\| &= \|\Phi(f_N - h)v\| \le \|f_N - h\|_1\|v\| \\
&\le \|f_N - h\|_2\|v\| \le \tfrac12\|\Phi(f_N)v\|.
\end{aligned}$$

Hence

$$\|\Phi(h)v\| \ge \|\Phi(f_N)v\| - \|\Phi(f_N)v - \Phi(h)v\| \ge \tfrac12\|\Phi(f_N)v\| > 0,$$

and $\Phi(h)v$ is not 0.

The function $h$ lies in some finite-dimensional vector subspace $S$ of $L^2(G)$
that is invariant under left translation. Let $h_1, \dots, h_n$ be a basis of $S$, and write
$h_j(g^{-1}x) = \sum_{i=1}^n c_{ij}(g)h_i(x)$. The formal computation

$$\begin{aligned}
\Phi(g)\Phi(h_j)v &= \Phi(g)\int_G h_j(x)\Phi(x)v\,dx = \int_G h_j(x)\Phi(gx)v\,dx \\
&= \int_G h_j(g^{-1}x)\Phi(x)v\,dx = \sum_{i=1}^n c_{ij}(g)\int_G h_i(x)\Phi(x)v\,dx \\
&= \sum_{i=1}^n c_{ij}(g)\Phi(h_i)v
\end{aligned}$$

suggests that the vector subspace $\sum_j \mathbb{C}\Phi(h_j)v$, which is finite dimensional
and lies in $U^\perp$, is an invariant subspace for $\Phi$ containing the nonzero vector
$\Phi(h)v$. To justify the formal computation, we argue as in the proof of the formula
$\Phi(f * h) = \Phi(f)\Phi(h)$, redoing the calculation with an inner product with
$v'$ in place throughout. By dimensionality this subspace of $U^\perp$ contains an
irreducible invariant subspace, and the existence of such a subspace contradicts the
maximality of $U$ and proves the theorem.


Corollary 6.35. Every irreducible unitary representation of a compact group 
is finite dimensional. 


PROOF. This is immediate from Theorem 6.34. 


Corollary 6.36. Let $\Phi$ be a unitary representation of the compact group $G$ on
a complex Hilbert space $V$. For each irreducible unitary representation $\tau$ of $G$, let
$E_\tau$ be the orthogonal projection on the sum of all irreducible invariant subspaces
of $V$ that are equivalent to $\tau$. Then $E_\tau$ is given by $d_\tau\Phi(\overline{\chi_\tau})$, where $d_\tau$ is the
degree of $\tau$ and $\chi_\tau$ is the character of $\tau$, and the image of $E_\tau$ is the orthogonal
sum of irreducible invariant subspaces that are equivalent to $\tau$. Moreover, if $\tau$
and $\tau'$ are inequivalent, then $E_\tau E_{\tau'} = E_{\tau'}E_\tau = 0$. Finally every $v$ in $V$ satisfies

$$v = \sum_\tau E_\tau v,$$

with the sum an infinite sum over a set of representatives $\tau$ of all equivalence
classes of irreducible unitary representations of $G$ and taken in the sense of
convergence in the Hilbert space.


REMARK. For each $\tau$, the projection $E_\tau$ is called the orthogonal projection on
the isotypic subspace of type $\tau$.


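PROOF. Let $\tau$ be irreducible with degree $d_\tau$, and put $E'_\tau = d_\tau\Phi(\overline{\chi_\tau})$. Our
formulas for characters and for operators $\Phi(f)$ give us the two formulas

$$\begin{aligned}
E'_\tau E'_{\tau'} &= d_\tau d_{\tau'}\Phi(\overline{\chi_\tau})\Phi(\overline{\chi_{\tau'}}) = d_\tau d_{\tau'}\Phi(\overline{\chi_\tau} * \overline{\chi_{\tau'}}) = 0 \qquad \text{if } \tau \not\cong \tau', \\
{E'_\tau}^2 &= d_\tau^2\Phi(\overline{\chi_\tau} * \overline{\chi_\tau}) = d_\tau\Phi(\overline{\chi_\tau}) = E'_\tau.
\end{aligned}$$

The first of these says that $E'_\tau E'_{\tau'} = E'_{\tau'}E'_\tau = 0$ if $\tau$ and $\tau'$ are inequivalent,
and the second says that $E'_\tau$ is a projection. In fact, $E'_\tau$ is self adjoint and is
therefore an orthogonal projection. To see the self-adjointness, we let $\{u_i\}$ be an
orthonormal basis of the vector space on which $\tau$ operates by unitary transformations.
Then

$$\overline{\chi_\tau}^{\,*}(x) = \chi_\tau(x^{-1}) = \sum_i (\tau(x^{-1})u_i, u_i) = \sum_i (u_i, \tau(x)u_i) = \sum_i \overline{(\tau(x)u_i, u_i)} = \overline{\chi_\tau(x)}.$$

Therefore

$${E'_\tau}^* = d_\tau\Phi(\overline{\chi_\tau})^* = d_\tau\Phi(\overline{\chi_\tau}^{\,*}) = d_\tau\Phi(\overline{\chi_\tau}) = E'_\tau,$$

and the projection $E'_\tau$ is an orthogonal projection.

Let $U$ be an irreducible finite-dimensional subspace of $V$ on which $\Phi|_U$
is equivalent to $\tau$, and let $u_1, \dots, u_n$ be an orthonormal basis of $U$. If we
write $\Phi(x)u_j = \sum_{i=1}^n \Phi_{ij}(x)u_i$, then $\Phi_{ij}(x) = (\Phi(x)u_j, u_i)$ and $\chi_\tau(x) =
\sum_{i=1}^n \Phi_{ii}(x)$. Thus a formal computation with Schur orthogonality gives

$$E'_\tau u_j = d_\tau\int_G \overline{\chi_\tau(x)}\,\Phi(x)u_j\,dx = d_\tau\int_G \sum_{i,k} \overline{\Phi_{kk}(x)}\,\Phi_{ij}(x)u_i\,dx = u_j,$$

and we can justify this computation by using inner products with $v'$ throughout.
As a result, we see that $E'_\tau$ is the identity on every irreducible subspace of type $\tau$.

Now let us apply $E'_\tau$ to a Hilbert space orthogonal sum $V = \sum_\alpha V_\alpha$ of the kind
in Theorem 6.34. We have just seen that $E'_\tau$ is the identity on $V_\alpha$ if $V_\alpha$ is of type
$\tau$. If $V_\alpha$ is of type $\tau'$ with $\tau'$ not equivalent to $\tau$, then $E'_{\tau'}$ is the identity on $V_\alpha$,
and we have $E'_\tau u = E'_\tau E'_{\tau'}u = 0$ for all $u \in V_\alpha$. Consequently $E'_\tau$ is 0 on $V_\alpha$,
and we conclude that $E'_\tau = E_\tau$. This completes the proof.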


EXAMPLE. The right-regular representation $r$ of $G$ on $L^2(G)$. Let $\tau$ be an
abstract irreducible unitary representation of $G$, let $(u_1, \dots, u_n)$ be an ordered
orthonormal basis of the space on which $\tau$ acts, and form matrices relative to
this basis that realize each $\tau(x)$. The formula is $\tau_{ij}(x) = (\tau(x)u_j, u_i)$. The
computation

$$(r(g)\tau_{ij})(x) = \tau_{ij}(xg) = \sum_{j'} \tau_{ij'}(x)\tau_{j'j}(g) = \sum_{j'} \tau_{j'j}(g)\tau_{ij'}(x)$$

shows that the matrix coefficients corresponding to a fixed row, those with $i$ fixed
and $j$ varying, form an invariant subspace for $r$. The matrix of this representation
is $[\tau_{j'j}(g)]$, and thus the representation is irreducible of type $\tau$. Since these spaces
are orthogonal to one another by Schur orthogonality, the dimension of the image
of $E_\tau$ is at least $d_\tau^2$. On the other hand, Corollary 6.32 says that such matrix
coefficients relative to an orthonormal basis, as $\tau$ varies through representatives of
all equivalence classes of irreducible representations, form a maximal orthogonal
system in $L^2(G)$. The coefficients corresponding to any $\tau'$ not equivalent to $\tau$
are in the image of $E_{\tau'}$ and are not of type $\tau$. Therefore the orthogonal sum of the
spaces of matrix coefficients for each fixed row equals the image of $E_\tau$, and the
dimension of the image equals $d_\tau^2$. The corollary tells us that the formula for the
projection is $E_\tau f = r(d_\tau\overline{\chi_\tau})f$. To see what this is concretely, we use the
definitions to compute that

$$\begin{aligned}
(E_\tau f, h) = (r(d_\tau\overline{\chi_\tau})f, h) &= \int_G d_\tau\overline{\chi_\tau(x)}\,(r(x)f, h)\,dx \\
&= \int_G \int_G d_\tau\overline{\chi_\tau(x)}\,f(yx)\overline{h(y)}\,dy\,dx = \int_G \int_G d_\tau\chi_\tau(x^{-1})f(yx)\overline{h(y)}\,dy\,dx \\
&= \int_G \int_G d_\tau\chi_\tau(x^{-1})f(yx)\overline{h(y)}\,dx\,dy = (f * d_\tau\chi_\tau, h).
\end{aligned}$$

Therefore the orthogonal projection is given by $E_\tau f = f * d_\tau\chi_\tau$.
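
The formula $E_\tau f = f * d_\tau\chi_\tau$ lends itself to a finite check. The sketch below assumes $G = \mathfrak{S}_3$ with normalized counting measure and its three irreducible characters (trivial, sign, and fixed points minus one); the names `TYPES`, `project`, and the test function `f` are hypothetical. It verifies that the three operators are idempotent, annihilate one another's images, sum to the identity, and split $\|f\|_2^2$ orthogonally.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S3; g[i] is the image of i under g

def mul(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in range(3))

def inv(g):
    """Inverse permutation."""
    out = [0, 0, 0]
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def sign(g):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1

def fixed_points(g):
    return sum(1 for i in range(3) if g[i] == i)

# (degree, character) for the trivial, sign, and 2-dimensional representations.
TYPES = [(1, lambda g: 1.0),
         (1, lambda g: float(sign(g))),
         (2, lambda g: float(fixed_points(g) - 1))]

def convolve(f, h):
    """(f * h)(x) = average over y of f(x y^{-1}) h(y), with f and h as dicts."""
    return {x: sum(f[mul(x, inv(y))] * h[y] for y in G) / len(G) for x in G}

def project(f, d, chi):
    """E_tau f = f * (d_tau chi_tau) for the right-regular representation."""
    return convolve(f, {g: d * chi(g) for g in G})

def norm_sq(f):
    """Squared L^2 norm for Haar measure of total mass one."""
    return sum(abs(f[g]) ** 2 for g in G) / len(G)

def sup_dist(f, h):
    return max(abs(f[g] - h[g]) for g in G)

f = {g: complex(k + 1, (-1) ** k * k) for k, g in enumerate(G)}
pieces = [project(f, d, chi) for d, chi in TYPES]

total = {g: sum(p[g] for p in pieces) for g in G}
recon_error = sup_dist(total, f)  # sum of E_tau f versus f
idem_error = max(sup_dist(project(p, d, chi), p)
                 for p, (d, chi) in zip(pieces, TYPES))
cross_error = max(norm_sq(project(pieces[a], *TYPES[b]))
                  for a in range(3) for b in range(3) if a != b)
norm_gap = abs(norm_sq(f) - sum(norm_sq(p) for p in pieces))
print(recon_error, idem_error, cross_error, norm_gap)  # all near zero
```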


Corollary 6.36 is a useful result in taking advantage of symmetries in analysis 
problems. Imagine that the problem is to understand some linear operator on the 
space in question, and suppose that the space carries a representation of a compact 
group that commutes with the operator. This is exactly the situation with some 
of the examples of separation of variables in partial differential equations as in
Section I.2. The idea is that under mild assumptions, the operator carries each
isotypic subspace to itself. Hence the problem gets reduced to an understanding 
of the linear operator on each of the isotypic subspaces. 

In order to have a concrete situation for purposes of illustration, let us assume 
that the linear operator is bounded, has domain the whole Hilbert space, and 
carries the space into itself. The following proposition then applies. 


Proposition 6.37. Let $T : V \to V$ be a bounded linear operator on the Hilbert
space $V$, and suppose that $\Phi$ is a unitary representation of the compact group $G$
on $V$ such that $T\Phi(g) = \Phi(g)T$ for all $g$ in $G$. Let $\tau$ be an abstract irreducible
unitary representation of $G$, and let $E_\tau$ be the orthogonal projection of $V$ on the
isotypic subspace of type $\tau$. Then $TE_\tau = E_\tau T$.


PROOF. For $v$ and $v'$ in $V$, $(TE_\tau v, v')$ is equal to

$$\begin{aligned}
(E_\tau v, T^*v') &= d_\tau\int_G \overline{\chi_\tau(x)}\,(\Phi(x)v, T^*v')\,dx = d_\tau\int_G \overline{\chi_\tau(x)}\,(T\Phi(x)v, v')\,dx \\
&= d_\tau\int_G \overline{\chi_\tau(x)}\,(\Phi(x)Tv, v')\,dx = (E_\tau Tv, v'),
\end{aligned}$$

and the result follows.


EXAMPLE. The Fourier transform on $L^2(\mathbb{R}^N)$ commutes with each member $\rho$
of the orthogonal group $O(N)$ because if $f$ has Fourier transform $\hat f$, then

$$\hat f(\rho y) = \int_{\mathbb{R}^N} f(x)e^{-2\pi i x \cdot \rho y}\,dx = \int_{\mathbb{R}^N} f(x)e^{-2\pi i \rho^{-1}x \cdot y}\,dx = \int_{\mathbb{R}^N} f(\rho x)e^{-2\pi i x \cdot y}\,dx$$

says that $x \mapsto f(\rho x)$ has Fourier transform $y \mapsto \hat f(\rho y)$. Proposition 6.37 says that
the Fourier transform carries each isotypic subspace of $L^2(\mathbb{R}^N)$ under $O(N)$ into
itself. Let us return to Example 5 in Section 6, in which we dealt with the vector
space $V_k$ of all polynomials on $\mathbb{R}^N$ homogeneous of degree $k$. We saw that the
vector subspace $H_k$ of harmonic polynomials homogeneous of degree $k$ is an
invariant subspace under $O(N)$. In fact, more is true. One can show that $H_k$ is
irreducible and that the Laplacian $\Delta$ carries $V_k$ onto $V_{k-2}$, so that
$V_k = H_k \oplus |x|^2V_{k-2}$. It follows from the
latter fact that the space of restrictions to the unit sphere $S^{N-1}$ of all polynomials
is the same as the space of restrictions to $S^{N-1}$ of all harmonic polynomials,
with each irreducible representation $H_k$ of $O(N)$ occurring with multiplicity 1.
Applying the Stone-Weierstrass Theorem on $S^{N-1}$ and untangling matters, we
find for $L^2(S^{N-1})$ that the isotypic subspaces under $O(N)$ are the restrictions of
the members of $H_k$, each having multiplicity 1. Passing to $L^2(\mathbb{R}^N)$ and thinking
in terms of spherical coordinates, we see that each relevant $\tau$ for $L^2(\mathbb{R}^N)$ is the
representation on some $H_k$ and that the image of $E_\tau$ is the space of $L^2$ functions
that are finite linear combinations $\sum_j h_j f_j(|x|)$ of products of a member of $H_k$
and a function of $|x|$, the members $h_j$ of $H_k$ being linearly independent. According
to the proposition, this image is carried to itself by the Fourier transform. The
restriction of the Fourier transform to this image still commutes with members
of $O(N)$, and the idea is to use Schur's Lemma (Corollary 6.27) to show that
the Fourier transform has to send any $h_j(x)f(|x|)$ to $h_j(x)g(|x|)$; the details are
carried out in Problem 14 at the end of the chapter. Thus we can see on the
basis of general principles that the Fourier transform formula reduces to a single
1-dimensional integral on each space corresponding to some $H_k$. Armed with this
information, one can look for a specific integral formula, and the actual formula
turns out to involve an integration and classical Bessel functions.¹²
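
A finite analog of Proposition 6.37 may help fix ideas. In the sketch below, which rests on assumptions made here rather than in the text, $V = L^2(\mathfrak{S}_3)$ carries the right-regular representation $r$, the isotypic projections are taken in the form $E_\tau f = f * d_\tau\chi_\tau$, and $T$ is left convolution by an arbitrary function `k`; left convolution commutes with every right translation, so the proposition predicts $TE_\tau = E_\tau T$. All names are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S3; g[i] is the image of i under g

def mul(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in range(3))

def inv(g):
    """Inverse permutation."""
    out = [0, 0, 0]
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def sign(g):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1

def fixed_points(g):
    return sum(1 for i in range(3) if g[i] == i)

# (degree, character) for the trivial, sign, and 2-dimensional representations.
TYPES = [(1, lambda g: 1.0),
         (1, lambda g: float(sign(g))),
         (2, lambda g: float(fixed_points(g) - 1))]

def convolve(f, h):
    """(f * h)(x) = average over y of f(x y^{-1}) h(y), with f and h as dicts."""
    return {x: sum(f[mul(x, inv(y))] * h[y] for y in G) / len(G) for x in G}

def project(f, d, chi):
    """E_tau f = f * (d_tau chi_tau) for the right-regular representation."""
    return convolve(f, {g: d * chi(g) for g in G})

def right_translate(g, f):
    """(r(g) f)(x) = f(x g)."""
    return {x: f[mul(x, g)] for x in G}

def sup_dist(f, h):
    return max(abs(f[g] - h[g]) for g in G)

# Arbitrary kernel k and test function f on S3.
k = {g: complex(2 * i - 3, i * i) for i, g in enumerate(G)}
f = {g: complex(i + 1, -i) for i, g in enumerate(G)}

def T(u):
    """Left convolution by k, a bounded operator commuting with each r(g)."""
    return convolve(k, u)

commute_with_r = max(sup_dist(T(right_translate(g, f)), right_translate(g, T(f)))
                     for g in G)
commute_with_E = max(sup_dist(T(project(f, d, chi)), project(T(f), d, chi))
                     for d, chi in TYPES)
print(commute_with_r, commute_with_E)  # both should be at rounding level
```

Here the commutation with $E_\tau$ amounts to associativity of convolution, which is the finite-group counterpart of the integral computation in the proof.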


CONCLUDING REMARKS. Proposition 6.37 and the above example are concerned
with understanding a particular bounded linear operator, but realistic
applications are more concerned with linear operators that are unbounded. For
example, when the domain of a linear partial differential operator can be arranged
in such a way that the operator is self adjoint and a compact group of symmetries
operates, then one wants to exploit the symmetry group in order to express the
space of all functions annihilated by the operator as the limit of the sum of those
functions in an isotypic subspace. In mathematical physics the very hope that
this kind of reduction is possible has itself been useful, even without knowing
in advance the differential operator and the group of symmetries. The reason
is that numerical invariants of the compact group, such as the dimensions of
some of the irreducible representations, appear in physical data. One can look
for an appropriate group yielding those numerical invariants. This approach
worked long ago in analyzing spin, it worked more recently in attempts to classify
elementary particles, and it has been used still more recently in order to guess at
the role of group theory in string theory.


¹² Bessel functions were defined in Section IV.8 of Basic.


9. Problems 


1. Let G be a topological group. 

(a) Prove that the connected component of the identity element of G, i.e., the 
union of all connected sets containing the identity, is a closed subgroup that is 
group-theoretically normal. This subgroup is called the identity component 
of G. 

(b) Give an example of a topological group whose identity component is not 
open. 


2. The rotation group $SO(N)$ acts continuously on the unit sphere $S^{N-1}$ in $\mathbb{R}^N$
by matrix multiplication.
(a) Prove that the subgroup fixing the first standard basis vector is isomorphic
to $SO(N - 1)$.
(b) Prove that the action by $SO(N)$ is transitive on $S^{N-1}$ for $N \ge 2$.
(c) Deduce that there is a homeomorphism $SO(N)/SO(N - 1) \to S^{N-1}$ for
$N \ge 2$ that respects the action by $SO(N)$.


3. Let $G$ be a separable locally compact group, and suppose that $G$ has a continuous
transitive group action on a locally compact Hausdorff space $X$. Suppose that
$x_0$ is in $X$ and that $H$ is the (closed) subgroup of $G$ fixing $x_0$, so that there is a
one-one continuous map $\pi$ of $G/H$ onto $X$. Using the Baire Category Theorem
for locally compact Hausdorff spaces (Problem 3 of Chapter X of Basic), prove
that $\pi$ is an open map and that $\pi$ is therefore a homeomorphism.


4. Let $G_1$ and $G_2$ be separable locally compact groups, and let $\pi : G_1 \to G_2$ be a
continuous one-one homomorphism onto. Prove that $\pi$ is a homeomorphism.


5. Let $T^2 = \{(e^{i\theta}, e^{i\varphi})\}$. The line $\mathbb{R}^1$ acts on $T^2$ by
$$(x, (e^{i\theta}, e^{i\varphi})) \mapsto (e^{i\theta + ix}, e^{i\varphi + ix\sqrt{2}}).$$
Let $p$ be the point $(1, 1)$ of $T^2$ corresponding to $\theta = \varphi = 0$. The mapping of $\mathbb{R}^1$
into $T^2$ given by $x \mapsto xp$ is one-one. Is it a homeomorphism? Explain.


6. Let $G$ be a noncompact locally compact group, and let $V$ be a bounded open set.
By using the fact that $G$ cannot be covered by finitely many left translates of $V$,
prove that $G$ must have infinite left Haar measure, i.e., that a Haar measure for
a locally compact group can be finite only if the group is compact.


7. (a) Suppose that $G$ is a compact group, $\lambda$ is a left Haar measure, $\rho$ is a right Haar
measure, and $E$ is a Baire set. By evaluating $\int_{G \times G} I_E(xy)\,d(\rho \times \lambda)(x, y)$
as an iterated integral in each order, prove that $\lambda(E)\rho(G) = \lambda(G)\rho(E)$.

(b) Deduce the uniqueness of Haar measure for compact groups, together with
the unimodularity, from (a) and the existence of left and right Haar measures
for the group.


8. Suppose that $\{G_n\}_{n=1}^\infty$ is a sequence of separable compact groups. Let $G^{(n)} =
G_1 \times \cdots \times G_n$, and let $G$ be the direct product of all $G_n$. Let $\mu_n$, $\mu^{(n)}$, and $\mu$ be
Haar measures on $G_n$, $G^{(n)}$, and $G$, all normalized to have total measure 1.

(a) Why is $\mu^{(n)}$ equal to the product measure $\mu_1 \times \cdots \times \mu_n$?

(b) Show that $\mu^{(n)}$ defines a measure on a certain $\sigma$-algebra of Borel sets of $G$
that is consistent with $\mu$.

(c) Show that the smallest $\sigma$-algebra containing, for every $n$, the "certain
$\sigma$-algebra of Borel sets of $G$" as in (b), is the $\sigma$-algebra of all Borel sets of
$G$, so that $\mu$ can be regarded as the infinite product of $\mu_1, \mu_2, \dots$.


9. Let $G$ be a locally compact topological group with a left Haar measure $d_l x$, and
let $\Phi$ be an automorphism of $G$ as a topological group, i.e., an automorphism
of the group structure that is also a homeomorphism of $G$. Prove that there is a
positive constant $a(\Phi)$ such that $d_l(\Phi(x)) = a(\Phi)\,d_l x$.


10. Let $G$ be a locally compact group with two closed unimodular subgroups $S$ and
$T$ such that $G = S \times T$ topologically and such that $T$ is group-theoretically
normal. Write elements of $G$ as $st$ with $s \in S$ and $t \in T$. Let $ds$ and $dt$ be Haar
measures on $S$ and $T$. Since $t \mapsto sts^{-1}$ is an automorphism of $T$ for each $s \in S$,
the previous problem produces a constant $\delta(s)$ such that $d(sts^{-1}) = \delta(s)\,dt$.
(a) Prove that $ds\,dt$ is a left Haar measure for $G$.

(b) Prove that $\delta(s)\,ds\,dt$ is a right Haar measure for $G$.


11. This problem leads to the same conclusion as Proposition 4.8, that any locally
compact topological vector space over $\mathbb{R}$ is finite-dimensional, but it gives a more
conceptual proof than the one in Chapter IV. Let $V$ be such a space. For each
real $c \ne 0$, let $|c|_V$ be the constant $a(\Phi)$ from Problem 9 when the measure is an
additive Haar measure for $V$ and $\Phi$ is multiplication by $c$. Define $|0|_V = 0$.

(a) Prove that $c \mapsto |c|_V$ is a continuous function from $\mathbb{R}$ into $[0, +\infty)$ such that
$|c_1c_2|_V = |c_1|_V|c_2|_V$ and such that $|c_1| \le |c_2|$ implies $|c_1|_V \le |c_2|_V$.

(b) If $W$ is a closed vector subspace of $V$, use Theorem 6.18 to prove that
$|c|_V = |c|_W|c|_{V/W}$.

(c) Using (b), Proposition 4.5, Corollary 4.6, and the formula $|c|_{\mathbb{R}^N} = |c|^N$,
prove that $V$ has to be finite-dimensional.

12. Let Φ be a finite-dimensional unitary representation of a compact group G on a
finite-dimensional inner-product space V. The members of the dual V* are of
the form ℓ_v = (·, v) with v in V, by virtue of the Riesz Representation Theorem
for Hilbert spaces. Define (ℓ_{v_1}, ℓ_{v_2}) = (v_2, v_1). Prove that the result is the inner
product on V* giving rise to the Banach-space norm on V*, and prove that the
contragredient representation Φ^c has Φ^c(x)ℓ_v = ℓ_{Φ(x)v} and is unitary in this
inner product.


13. Let Φ and Φ' be two irreducible unitary representations of a compact group
G on the same finite-dimensional vector space V, and suppose that they are
equivalent in the sense that there is some linear invertible E : V → V with
EΦ(g) = Φ'(g)E for all g ∈ G. Prove that Φ and Φ' are unitarily equivalent in
the sense that this equality for some invertible E implies this equality for some
unitary E.


14. This problem seeks to fill in the argument concerning Schur's Lemma in the
example near the end of Section 8. Introduce an inner product in the space
H_k of harmonic polynomials on R^N homogeneous of degree k to make the
representation of O(N) on H_k be unitary, and let {h_j} be an orthonormal basis.
The representation Φ on H_k and its corresponding matrices [Φ(ρ)_{ij}] are given by
(Φ(ρ)h_j)(x) = h_j(ρ^{-1}x) = Σ_i Φ(ρ)_{ij} h_i(x). Let ℱ be the Fourier transform on
R^N, and fix a function f(|x|) such that |x|^k f(|x|) is in L^2(R^N). Define a matrix
F(|y|) = [f_{ij}(|y|)] for each |y| by ℱ(h_j(x) f(|x|))(y) = Σ_i h_i(y) f_{ij}(|y|).
(a) Assuming that the functions f and F are continuous functions of |x|, prove
that F(|y|)[Φ(ρ)_{ij}] = [Φ(ρ)_{ij}]F(|y|) for all ρ.
(b) Deduce from (a) and Corollary 6.27 that ℱ(h(x) f(|x|)) is of the form
h(y)g(|y|) if h is in H_k and the continuity hypothesis is satisfied.
(c) Show how the continuity hypothesis can be dropped in the above argument.


15. Making use of the result of Problem 12, show that the matrix coefficients of the
contragredient Φ^c of a finite-dimensional representation Φ of a compact group
are the complex conjugates of those of Φ and the characters satisfy χ_{Φ^c} = \overline{χ_Φ}.


16. An example in Section 8 examined the right-regular representation r of a compact
group G, given by (r(g)f)(x) = f(xg), and showed that the linear span of the
matrix coefficients of an irreducible τ equals the whole isotypic space of type
τ, a decomposition of this space into irreducible representations being given by
the decomposition into rows. Show similarly for the left-regular representation
l, given by (l(g)f)(x) = f(g^{-1}x), that the linear span of the matrix coefficients
of the irreducible τ equals the whole isotypic space of type τ^c, a decomposition
of this space into irreducible representations being given by the decomposition
into columns.


17. Let G be a compact group, and let V be a complex Hilbert space.
(a) For G = S^1, prove that the left-regular representation l of G on L^2(G) is
not continuous in the operator norm topology, i.e., that g ↦ l(g) is not
continuous from G into the Banach space of bounded linear operators on
L^2(G).


(b) Suppose that g ↦ Φ(g) is a homomorphism of G into unitary operators on V
that is weakly continuous, i.e., that has the property that g ↦ (Φ(g)u, v)
is continuous for each u and v in V. Prove that g ↦ Φ(g) is strongly
continuous in the sense that g ↦ Φ(g)v is continuous for each v in V, i.e.,
that Φ is a unitary representation.


18. Let G be a compact group.
(a) Let Φ be an irreducible unitary representation of G, and let f be a linear
combination of matrix coefficients of the contragredient Φ^c of Φ. Prove that
f(1) = d Tr Φ(f), where d is the degree of Φ.
(b) Let {Φ^{(α)}} be a maximal set of mutually inequivalent irreducible unitary
representations of G, and let d^{(α)} be the degree of Φ^{(α)}. Prove that each
trigonometric polynomial f on G satisfies the Fourier inversion formula
f(1) = Σ_α d^{(α)} Tr Φ^{(α)}(f), the sum being a finite sum in the case of a
trigonometric polynomial.
(c) Deduce the Plancherel formula for trigonometric polynomials on G from (b).
(d) If G is a finite group, prove that every complex-valued function on G is a
trigonometric polynomial.


19. Let G be a compact group.
(a) Prove that if h is any member of C(G) such that h(gxg^{-1}) = h(x) for every
g and x in G, then h * f = f * h for every f in L^1(G).
(b) Prove that if f is a trigonometric polynomial, then x ↦ ∫_G f(gxg^{-1}) dg is
a linear combination of characters of irreducible representations.
(c) Using the Approximation Theorem, prove that any member h of C(G) such
that h(gxg^{-1}) = h(x) for every g and x in G is the uniform limit of a
sequence of linear combinations of irreducible characters.
(d) Prove that the irreducible characters form an orthonormal basis of the
closed vector subspace of all members h of L^2(G) satisfying h(x) =
∫_G h(gxg^{-1}) dg almost everywhere.


20. Let G be a finite group, let {Φ^{(α)}} be a maximal set of inequivalent irreducible
representations of G, and let d^{(α)} be the degree of Φ^{(α)}.
(a) Prove that Σ_α (d^{(α)})^2 equals the number of elements in G.
(b) Using (d) in the previous problem, prove that the number of Φ^{(α)}'s equals the
number of conjugacy classes of G, i.e., the number of equivalence classes of
G under the equivalence relation that x ~ y if x = gyg^{-1} for some g ∈ G.
(c) In a symmetric group S_n, two elements are conjugate if and only if they
have the same cycle structure. In S_4, two of the irreducible representations
are 1-dimensional. Using this information and the above facts, determine
how many Φ^{(α)}'s there are for S_4 and what degrees they have.
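
The counting in Problem 20 can be tried on the smaller group S_3. The sketch below is an illustration only, not part of the problem: it counts conjugacy classes by cycle type and then lists every multiset of degrees whose squares sum to the group order. The helper names `cycle_type` and `degree_lists` are hypothetical, and the input that two representations of S_3 are 1-dimensional (the trivial and sign representations) is an assumption supplied by hand.

```python
from itertools import permutations
from math import factorial


def cycle_type(perm):
    """Sorted cycle lengths of a permutation given as a tuple of images."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))


def degree_lists(order, count, known=()):
    """All sorted tuples of `count` positive integers that contain the known
    degrees and whose squares sum to `order` (hypothetical helper)."""
    results = []

    def extend(partial, remaining, slots, smallest):
        if slots == 0:
            if remaining == 0:
                results.append(tuple(sorted(known + tuple(partial))))
            return
        d = smallest
        while d * d <= remaining:
            extend(partial + [d], remaining - d * d, slots - 1, d)
            d += 1

    extend([], order - sum(d * d for d in known), count - len(known), 1)
    return results


n = 3
classes = {cycle_type(p) for p in permutations(range(n))}
# Number of classes, and the degree lists compatible with parts (a) and (b).
print(len(classes), degree_lists(factorial(n), len(classes), known=(1, 1)))
```

Setting n = 4 in the last lines runs the same search for the group in part (c); the code only enumerates possibilities and does not replace the argument the problem asks for.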


Problems 21-22 concern Theorem 6.16, its hypotheses, and related ideas. In the 
theory of (separable) “Lie groups,” if S and T are closed subgroups of a Lie group G 
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whose intersection is discrete and the sum of whose dimensions equals the dimension 
of G, then multiplication S x T — G is an open map. These problems deduce 
this open mapping property in a different way without any knowledge of Lie groups, 
and then they apply the result to give two explicit formulas for the Haar measure of 
SL(2, R) in terms of measures on subgroups. 


21. Let G be a separable locally compact group, and let S and T be closed subgroups
such that the image of multiplication as a map S × T → G is an open set in G.
Using the result of Problem 3, prove that S × T → G is an open map.


22. For the group G = SL(2,R), let

    K = {k_θ = [cos θ, −sin θ; sin θ, cos θ]},    M = {m_± = ±[1, 0; 0, 1]},
    A = {a_x = [e^x, 0; 0, e^{−x}]},    N = {n_y = [1, y; 0, 1]},    and    V = {v_t = [1, 0; t, 1]},

each 2-by-2 matrix being written row by row with the rows separated by a semicolon.

(a) Prove that AN is a closed subgroup and that every element of G is uniquely
the product of an element of K and an element of AN. Using Theorem 6.16,
show that the formula

    ℓ(f) = ∫_{θ∈[0,2π)} ∫_{x=−∞}^{∞} ∫_{y=−∞}^{∞} f(k_θ a_x n_y) e^{2x} dy dx dθ

defines a translation-invariant linear functional on C_com(G).

(b) Prove that MAN is a closed subgroup and that every element [a, b; c, d] of G
with a ≠ 0, and no other element of G, is a product of an element of V and
an element of MAN. Assume that the subset of elements [a, b; c, d] of G with
a = 0 has Haar measure 0. Using Theorem 6.16, show that the formula

    ℓ(f) = Σ_± ∫_{t=−∞}^{∞} ∫_{x=−∞}^{∞} ∫_{y=−∞}^{∞} f(v_t m_± a_x n_y) e^{2x} dy dx dt

defines a translation-invariant linear functional on C_com(G).


Problems 23-27 do some analysis on the group G = SU(2) of 2-by-2 unitary
matrices of determinant 1. Following the notation introduced in Example 4 in
Section 6 and in its continuation later in that section, let Φ_n be the representation
of G on the homogeneous holomorphic polynomials of degree n in z_1 and z_2 given
by (Φ_n(g)P)(z) = P(g^{-1}z), where z is the column vector with entries z_1 and z_2.
Let T = {t_θ}, with t_θ = diag(e^{iθ}, e^{−iθ}), be the diagonal subgroup. The text
calculated that the character χ_n of Φ_n is given on T by

    χ_n(t_θ) = Tr Φ_n(t_θ) = e^{inθ} + e^{i(n−2)θ} + ··· + e^{−inθ}
             = (e^{i(n+1)θ} − e^{−i(n+1)θ}) / (e^{iθ} − e^{−iθ}).

Take for granted that Φ_n is irreducible for each n ≥ 0.

23. Take as known from linear algebra that every member of SU(2) is of the form
g t_θ g^{-1} for some g ∈ SU(2) and some θ. Show that the only ambiguity in t_θ is
between θ and −θ. Prove that the linear mapping of C(G) to C(T) carrying f in
C(G) to the function t_θ ↦ ∫_G f(g t_θ g^{-1}) dg has image all functions φ ∈ C(T)
with φ(t_{−θ}) = φ(t_θ).

24. Reinterpret the image in the previous problem as all continuous functions on the
quotient space T/{1, ψ}, where ψ : T → T interchanges t_{−θ} and t_θ. Why is this
space compact Hausdorff? Why then can it be identified with [0, π]?


25. Prove that there is a Borel measure μ on [0, π] such that

    ∫_G f(x) dx = ∫_{[0,π]} ∫_G f(g t_θ g^{-1}) dg dμ(θ)

for all f in C(G).

26. Follow these steps to identify dμ(θ) in the previous problem and thereby have
a formula for integrating over G = SU(2) by first integrating over conjugacy
classes. Such a formula can be obtained by computations with coordinates and
use of the change-of-variables formula for multiple integrals, but the method here
is shorter.
(a) Using the orthogonality relations ∫_G χ_n(x) \overline{χ_0(x)} dx = δ_{n0}, prove that
∫_{[0,π]} dμ(θ) = 1 and that ∫_{[0,π]} (e^{ikθ} + e^{−ikθ}) dμ(θ) is −1 for k = 2 but is 0
for k = 1 and k ≥ 3.
(b) Extend μ to [−π, π] by setting it equal to 0 on [−π, 0), define μ' on [−π, π]
by μ'(E) = (1/2)(μ(E) + μ(−E)), observe that μ' is even, and check that
∫_{[−π,π]} cos nθ dμ'(θ) is equal to 1 for n = 0, to −1/2 for n = 2, and to 0 for
n = 1 and n ≥ 3.
(c) Deduce that the periodic extension of μ' from (−π, π] to R is given by its
Fourier–Stieltjes series dμ'(θ) = (1/(2π))(1 − cos 2θ) dθ.
(d) (Special case of Weyl integration formula) Conclude that

    ∫_G f(x) dx = (1/π) ∫_{−π}^{π} [∫_G f(g t_θ g^{-1}) dg] sin^2 θ dθ.


27. Prove that every irreducible unitary representation of SU(2) is equivalent to
some Φ_n.
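
The formula in Problem 26(d) can be checked numerically against the orthogonality of characters. For a class function such as f = χ_n χ_m the inner integral over G is just f(t_θ), so the right side reduces to (1/π) ∫_{−π}^{π} χ_n(t_θ) χ_m(t_θ) sin^2 θ dθ, which should equal 1 when n = m and 0 otherwise. The sketch below is a sanity check rather than a proof; the function names and the choice of a midpoint rule are assumptions made for illustration.

```python
import math


def chi(n, theta):
    """Character of Phi_n at t_theta: e^{in theta} + e^{i(n-2) theta} + ... + e^{-in theta},
    which is real and equals a sum of cosines."""
    return sum(math.cos((n - 2 * j) * theta) for j in range(n + 1))


def weyl_inner_product(n, m, steps=2000):
    """Midpoint-rule value of (1/pi) * integral over [-pi, pi] of
    chi_n(t_theta) * chi_m(t_theta) * sin^2(theta) d theta."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi + (k + 0.5) * h
        total += chi(n, theta) * chi(m, theta) * math.sin(theta) ** 2
    return total * h / math.pi


for n in range(4):
    print([round(weyl_inner_product(n, m), 6) for m in range(4)])
```

The printed array should be close to the 4-by-4 identity matrix, which is what the orthogonality relations for the characters χ_n predict once the measure is (1/π) sin^2 θ dθ on [−π, π].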


Problems 28-32 concern locally compact topological fields. Each such is of interest
from the point of view of the present chapter because its additive group is a locally
compact abelian group and its nonzero elements form another locally compact abelian
group under multiplication. A topological field is a field with a Hausdorff topology
such that addition, negation, multiplication, and inversion are continuous. The fields
R and C are examples. Another example is the field Q_p of p-adic numbers, where p is
a prime. To construct this field, one defines on the rationals Q a function | · |_p by setting
|0|_p = 0 and taking |p^n r/s|_p equal to p^{−n} if r and s are relatively prime integers
not divisible by p. Then d(x, y) = |x − y|_p is a metric on Q, and the metric space
completion is Q_p. The function | · |_p extends continuously to Q_p and is called the
p-adic norm. It satisfies something better than the triangle inequality, namely
|x + y|_p ≤ max{|x|_p, |y|_p}; this is called the ultrametric inequality. Problems 27-31
of Chapter II of Basic show that the arithmetic operations on Q extend continuously
to Q_p and that Q_p becomes a topological field such that |xy|_p = |x|_p |y|_p. Because
of the ultrametric inequality the subset Z_p of Q_p with |x|_p ≤ 1 is a commutative ring
with identity; it is called the ring of p-adic integers. It is a topological ring in that
its addition, negation, and multiplication are continuous. Moreover, it is compact
because every closed bounded subset of Q_p can be shown to be compact. The subset
I of Z_p with |x|_p ≤ p^{−1} is the unique maximal ideal of Z_p, and the quotient Z_p/I
is a field of p elements.
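
On the rationals themselves the p-adic norm can be computed exactly, and the ultrametric inequality and the multiplicativity |xy|_p = |x|_p |y|_p can be spot-checked. The following sketch is only an illustration on a few sample rationals; the function name `padic_norm` and the sample values are choices made here, not notation from the text.

```python
from fractions import Fraction


def padic_norm(x, p):
    """Return |x|_p for a rational x, as an exact Fraction.

    Writes x = p^n * r/s with r and s not divisible by p and returns p^(-n);
    by convention |0|_p = 0.
    """
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    n = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:   # powers of p in the numerator raise n
        num //= p
        n += 1
    while den % p == 0:   # powers of p in the denominator lower n
        den //= p
        n -= 1
    return Fraction(1, p) ** n


samples = [Fraction(3, 4), Fraction(-50, 7), Fraction(12), Fraction(1, 27)]
for p in (2, 3, 5):
    for x in samples:
        for y in samples:
            # Ultrametric inequality and multiplicativity on the samples.
            assert padic_norm(x + y, p) <= max(padic_norm(x, p), padic_norm(y, p))
            assert padic_norm(x * y, p) == padic_norm(x, p) * padic_norm(y, p)
print(padic_norm(12, 2), padic_norm(Fraction(5, 9), 3))
```

Exact rational arithmetic is used so that the comparisons are not affected by rounding; the checks cover only the listed samples and say nothing about the completion Q_p.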


28. Prove that every compact topological field is finite. 


29. Let F be a locally compact topological field, and let F^× be the group of nonzero
elements, the group operation being multiplication.
(a) Let c be in F^×, and define |c|_F to be the constant a(Φ) from Problem 9
when the measure is an additive Haar measure and Φ is multiplication by c.
Define |0|_F = 0. Prove that c ↦ |c|_F is a continuous function from F into
[0, +∞) such that |c_1 c_2|_F = |c_1|_F |c_2|_F.
(b) If dx is a Haar measure for F as an additive locally compact group, prove
that dx/|x|_F is a Haar measure for F^× as a multiplicative locally compact
group.
(c) Let F = R be the locally compact field of real numbers. Compute the
function x ↦ |x|_F. Do the same thing for the locally compact field F = C
of complex numbers.
(d) Let F = Q_p be the locally compact field of p-adic numbers, where p is a
prime. Compute the function x ↦ |x|_F.
(e) For the field F = Q_p of p-adic numbers, suppose that the ring Z_p of p-adic
integers has additive Haar measure 1. What is the additive Haar measure of
the maximal ideal I of Z_p?


30. Consider Q_p as a locally compact abelian group under addition.
(a) Prove from the continuity that any multiplicative character of the additive
group Q_p is trivial on some subgroup p^n Z_p for sufficiently large n.
(b) Tell how to define a multiplicative character φ_0 of the additive group Q_p in
such a way that φ_0 is 1 on Z_p and φ_0(p^{−1}) = e^{2πi/p}.
(c) If φ is any multiplicative character of the additive group Q_p, prove that there
exists a unique element k of Q_p such that φ(x) = φ_0(kx) for all x in Q_p.


31. Let P = {∞} ∪ {primes}. For v in P, let Q_v be the field of p-adic numbers if v
is a prime p, or R if v = ∞. For v in P, define | · |_v on Q_v as follows: this is to
be the p-adic norm on Q_p if v is a prime p, and it is to be the ordinary absolute
value on R if v = ∞. Each member of the rationals Q can be regarded as a
member of Q_v for each v in P. Prove that each nonzero rational number x has
|x|_v ≠ 1 for only finitely many v.


32. (Artin product formula) For each nonzero rational number x, the fact that
|x|_v ≠ 1 for only finitely many v in P shows that ∏_v |x|_v is a well-defined
rational number. Prove that actually ∏_v |x|_v = 1.
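
The product formula of Problem 32 is easy to test on examples before proving it. The sketch below multiplies the ordinary absolute value by the p-adic norms at the finitely many primes dividing the numerator or denominator, since every other prime contributes a factor 1 (this is the content of Problem 31). The helper names are hypothetical, and the computation is a check on sample values, not an argument.

```python
from fractions import Fraction


def prime_factors(n):
    """Distinct prime factors of a positive integer, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def padic_norm(x, p):
    """|x|_p for a nonzero rational x, as an exact Fraction."""
    n, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        n += 1
    while den % p == 0:
        den //= p
        n -= 1
    return Fraction(1, p) ** n


def artin_product(x):
    """Product of |x|_v over v = infinity and all primes, for nonzero rational x."""
    x = Fraction(x)
    primes = set(prime_factors(abs(x.numerator))) | set(prime_factors(x.denominator))
    product = abs(x)                      # the factor at v = infinity
    for p in primes:                      # all other primes contribute 1
        product *= padic_norm(x, p)
    return product


for x in (Fraction(-50, 7), Fraction(12), Fraction(1, 27), Fraction(-1)):
    print(x, artin_product(x))
```

For x = −50/7, for instance, the factors are 50/7 at ∞, 1/2 at 2, 1/25 at 5, and 7 at 7.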
Problems 33-38 concern the ring A_Q of adeles of the rationals Q and the group of
ideles defined in terms of it. These objects are important tools in algebraic number
theory, and they provide interesting examples of locally compact abelian groups. Part
of the idea behind them is to study number-theoretic questions about the integers, such
as the solving of Diophantine equations or the factorization of monic polynomials
with integer coefficients, by first studying congruences. One studies a congruence
modulo each power of any prime, as well as any limitations imposed by treating
the coefficients as real. The ring A_Q of adeles of Q is a structure that incorporates
simultaneously information about all congruences modulo each prime power, together
with information about R. Its definition makes use of the construction of direct limits
of topological spaces as in Problems 26-30 in Chapter IV, as well as the material
concerning p-adic numbers in Problems 29-32 above.


33. The construction of restricted direct products in Problem 30 at the end of
Chapter IV assumed that J is a nonempty index set, S_0 is a finite subset, X_i is a
locally compact Hausdorff space for each i ∈ J, and K_i is a compact open subset of
X_i for each i ∉ S_0. As in that problem, for each finite subset S of J containing
S_0, let

    X(S) = (×_{i∈S} X_i) × (×_{i∉S} K_i),

giving it the product topology. Suppose that each X_i, for i ∈ J, is in fact a locally
compact group and K_i, for i ∉ S_0, is a compact open subgroup of X_i. Prove
that each X(S), with coordinate-by-coordinate operations, is a locally compact
group and that the direct limit X acquires the structure of a locally compact group.
Prove also that if each X_i is a locally compact topological ring and each K_i is a
compact subring, then each X(S) is a locally compact topological ring and so is
the direct limit X.


34. In the construction of the previous problem, let J = P = {∞} ∪ {primes}
and S_0 = {∞}, and form the restricted direct product of the various topological
fields Q_v for v ∈ P with respect to the compact open subrings Z_v.
The above constructions lead to locally compact commutative rings A_Q(S) for
each finite subset S of P containing S_0, and the direct limit A_Q is the locally
compact commutative topological ring of adeles for Q. Show that each A_Q(S) is
an open subring of A_Q. Show that we can regard elements of A_Q as tuples
x = (x_∞, x_2, x_3, x_5, ..., x_v, ...) = (x_v)_{v∈P} in which all but finitely many
coordinates x_v are in Z_v.


35. For each rational number x, the fact that |x|_v ≤ 1 for all but finitely many v
allows us to regard the tuple (x, x, x, ...) as a member of A_Q. Thus we may
regard Q, embedded "diagonally," as a subfield of the ring A_Q. Prove that Q is
discrete, hence closed.


36. In the setting of the previous problem, prove that A_Q/Q is compact.

37. For the rings Q_v, Z_v, and A_Q, let Q_v^×, Z_v^×, and A_Q^× be the groups consisting of the
members of the rings whose multiplicative inverses are in the rings. Give Q_v^× and
Z_v^× the relative topology. In the case of A_Q^×, define the topology as a restricted
direct product of the locally compact groups Q_v^× for v ∈ P with respect to the
compact open subgroups Z_v^×. The locally compact group A_Q^× is called the group
of ideles of Q. Show that the set-theoretic inclusion of A_Q^× into A_Q is continuous
but is not a homeomorphism of A_Q^× with its image.


38. This problem constructs Haar measure on the ring A_Q considered as an additive
group. As in Problem 34, S denotes any finite subset of P containing {∞}.
(a) Fix S. This part of the problem constructs Haar measure on A_Q(S). For
each prime p in S, define Haar measure μ_p on Q_p to be normalized so that
μ_p(Z_p) = 1. Form a measure μ_S on A_Q(S) as follows: On the product X(S)
of R and the Q_p for p prime in S, use the product of Lebesgue measure and
the μ_p's. On the product Y(S) of all Z_p for p ∉ S, use the Haar measure on
the infinite product of the Z_p's obtained as in Problem 8. Then A_Q(S) =
X(S) × Y(S). Show that Haar measure μ_S on A_Q(S) may be taken as the
product of these measures on X(S) and Y(S) and that the resulting measures
are consistent as S varies.
(b) Show that each measure μ_S defines a set function on a certain σ-subalgebra
B(S) of Borel sets of A_Q that is the restriction to B(S) of a Haar measure on
all Borel subsets of A_Q.
(c) Show that the smallest σ-algebra for A_Q containing, for every finite S
containing {∞}, the σ-algebra B(S) as in (b) is the σ-algebra of all Borel sets
of A_Q.


Problems 39-47 concern almost periodic functions on topological groups. Let G be
any topological group. Define a bounded continuous function f : G → C to be
left almost periodic if every sequence of left translates of f, i.e., every sequence of
the form {g_n f} with (g_n f)(x) = f(g_n^{-1} x), has a uniformly convergent subsequence;
equivalently the condition is that the closure in the uniform norm of the set of left
translates of f is compact. Define right almost periodic functions similarly; it will
turn out that left almost periodic and right almost periodic imply each other. Take for
granted that the set of left almost periodic functions, call it LAP(G), is a uniformly
closed algebra stable under conjugation and containing the constants. Application of
the Stone Representation Theorem (Theorem 4.15) to LAP(G) produces a compact
Hausdorff space S_1, a continuous map p : G → S_1 with dense image, and a
norm-preserving algebra isomorphism of LAP(G) onto C(S_1). The space S_1 is called the
Bohr compactification of G. These problems show that S_1 has the structure of a
compact group and that the map of G into S_1 is a continuous group homomorphism.
Application of the Peter–Weyl Theorem to S_1 will give a Fourier analysis of LAP(G)
and an approximation property for its members in terms of finite-dimensional unitary
representations of G.


39. Suppose that K is a compact group and that ι : G → K is a continuous
homomorphism.
(a) Prove that every member of C(K) is left almost periodic and right almost
periodic on K.
(b) If F is in C(K), let f be the function on G defined by f(x) = F(ι(x)) for
x ∈ G. Prove that f is left almost periodic and right almost periodic on G.


40. Let Φ be a finite-dimensional unitary representation of G, and let f be a matrix
coefficient of Φ. Prove that f is left almost periodic and right almost periodic.


41. Let f be left almost periodic on G, let L_f be the subset of C(G) consisting of
the left translates of f, and let K_f be the closure in C(G) of L_f. The set K_f is
compact by definition of left almost periodicity.
(a) Prove that f is left uniformly continuous in the sense that for any ε > 0,
there is a neighborhood U of {1} such that ||gf − f||_sup < ε for all g in U.
(b) Each member of the group G acts on L_f with g_0(gf) = (g_0 g)f. Prove that
this operation of g_0 on L_f is an isometry of L_f onto itself.
(c) Prove that the operation of each g_0 on L_f extends uniquely to an isometry
ι_f(g_0) of K_f onto itself.


42. Let X be a compact metric space with metric d, and let Γ be the group of
isometries of X onto itself. Make Γ into a metric space (Γ, ρ) by defining
ρ(φ_1, φ_2) = sup_{x∈X} d(φ_1(x), φ_2(x)).
(a) Prove that Γ is compact as a metric space.
(b) Prove that Γ is a topological group in this topology, hence a compact group.
(c) Prove that the group action Γ × X → X given by (γ, x) ↦ γ(x) is
continuous.


43. Let Γ_f be the isometry group of K_f, and consider Γ_f as a compact metric space
with metric as in the previous problem.
(a) Prove that the mapping ι_f : G → Γ_f defined in Problem 41c is continuous.
(b) Prove that if h is in K_f, then the definition F_f(h)(γ) = (γ^{-1}h)(1) for
γ ∈ Γ_f yields a continuous function on Γ_f such that h(g_0) = F_f(h)(ι_f(g_0)).
(c) Conclude from the foregoing that f is right almost periodic and hence that left
almost periodic functions can now be considered as simply almost periodic.


44. For each almost periodic function f on G, let ι_f : G → Γ_f be the continuous
homomorphism discussed in Problems 41c and 43a. Let Γ = ∏_f Γ_f be
the product of the compact groups Γ_f, and define ι(g) = ∏_f ι_f(g), so that
ι : G → Γ is a continuous homomorphism. Problem 39b shows that if F is in
C(Γ), then the function h defined on G by h(x) = F(ι(x)) is almost periodic.
Prove that every almost periodic function on G arises in this way from some
continuous F on this particular Γ.
45. Let K be the closure of ι(G) in the compact group Γ in the previous problem, let
S_1 be the Bohr compactification of G, and let p : G → S_1 be the continuous map
defined by evaluations at the points of G. Prove that there is a homeomorphism
Φ : S_1 → K such that Φ ∘ p = ι, so that the construction of K can be regarded
as imposing a compatible group structure on the Bohr compactification of G.


46. Apply the Approximation Theorem to prove that every almost periodic function
on G can be approximated uniformly by linear combinations of matrix
coefficients of finite-dimensional unitary representations of G.


47. Suppose that G is abelian, and let p : G → K be the continuous homomorphism
of G into its Bohr compactification. Prove that the continuous multiplicative
characters of G coincide with the continuous multiplicative characters of K under
an identification by p. (Educational note: It is known from "Pontryagin duality"
that if the group K̂ of continuous multiplicative characters of the compact abelian
group K is given the discrete topology, then K is isomorphic to the compact group
of multiplicative characters of K̂, the topology on this character group being the
relative topology as a subset of the unit ball of the dual of C(K) in the weak-star
topology. Thus K may be obtained by forming the group of continuous
multiplicative characters of G, imposing the discrete topology, and forming the
group of multiplicative characters of the result.)


CHAPTER VII 


Aspects of Partial Differential Equations 


Abstract. This chapter provides an introduction to partial differential equations, particularly linear 
ones, beyond the material on separation of variables in Chapter I. 

Sections 1-2 give an overview. Section 1 addresses the question of how many side conditions
to impose in order to get local existence and uniqueness of solutions at the same time. The 
Cauchy—Kovalevskaya Theorem is stated precisely for first-order systems in standard form and 
for single equations of order greater than one. When the system or single equation is linear with 
constant coefficients and entire holomorphic data, the local holomorphic solutions extend to global 
holomorphic solutions. Section 2 comments on some tools that are used in the subject, particularly 
for linear equations, and it gives some definitions and establishes notation. 

Section 3 establishes the basic theorem that a constant-coefficient linear partial differential 
equation Lu = f has local solutions, the technique being multiple Fourier series. 

Section 4 proves a maximum principle for solutions of second-order linear elliptic equations 
Lu = 0 with continuous real-valued coefficients under the assumption that L(1) = 0. 

Section 5 proves that any linear elliptic equation Lu = f with constant coefficients has a 
“parametrix,” and it shows how to deduce from the existence of the parametrix the fact that the 
solutions u are as regular as the data f. The section also deduces a global existence theorem when f
is compactly supported; this result uses the existence of the parametrix and the constant-coefficient 
version of the Cauchy—Kovalevskaya Theorem. 

Section 6 gives a brief introduction to pseudodifferential operators, concentrating on what is 
needed to obtain a parametrix for any linear elliptic equation with smooth variable coefficients. 


1. Introduction via Cauchy Data 


The subject of partial differential equations is a huge and diverse one, and a 
short introduction necessarily requires choices. The subject has its origins in 
physics and nowadays has applications that include physics, differential geometry, 
algebraic geometry, and probability theory. A small amount of complex-variable 
theory will be extremely helpful, and this will be taken as known for this chapter. 
We shall ultimately concentrate on single equations, as opposed to systems, and on 
partial differential equations that are linear. After the first two sections the topics 
of this chapter will largely be ones that can be approached through a combination 
of functional analysis and Fourier analysis. 
Let us for now use subscript notation for partial derivatives, as in Section I.1.
A system of p partial differential equations in N variables for the unknown
functions u^{(1)}, ..., u^{(m)} consists of p expressions

    F_k(u^{(1)}, ..., u^{(m)}, u^{(1)}_{x_1}, ..., u^{(m)}_{x_N}, u^{(1)}_{x_1 x_1}, ..., x_1, ..., x_N) = 0,

1 ≤ k ≤ p, in an open set of R^N; it is assumed that the partial derivatives
that appear as variables have bounded order. When p = 1, we speak of simply
a partial differential equation. The highest order of a partial derivative that
appears is the order of the equation or system. We might expect that it would be
helpful if the number p of equations in a system equals the number m of unknown
functions, but one does not insist on this condition as a matter of definition. A
system in which the number p of equations equals the number m of unknown
functions is said to be "determined," but nothing is to be read into this terminology
without a theorem. We shall work only with determined systems. The equation
or system is linear homogeneous if each F_k is a linear function of its variables.
It is linear if each F_k is the sum of a linear function and a function of the N
domain variables that is taken as known.

The classical equations that we would like to include in a more general theory 
are the three studied in Section I.2 in connection with the method of separation 
of variables —the heat equation, the Laplace equation, and the wave equation— 
and one other, namely the Cauchy–Riemann equations. With Δ denoting the
Laplacian Δu = u_{x_1 x_1} + ··· + u_{x_N x_N}, the first three of these equations in N space
variables are

    u_t = Δu,    Δu = 0,    and    u_{tt} = Δu.

The Cauchy–Riemann equations are ordinarily written as a system

    u_x = v_y,    u_y = −v_x,

but they can be written also as a single equation if we think of u and v as real and
write f = u + iv. Then the system is equivalent to the single equation

    ∂f/∂z̄ = 0    or    f_{z̄} = 0,    where    ∂/∂z̄ = (1/2)(∂/∂x + i ∂/∂y).

Guided in part by the theory of ordinary differential equations of Chapter IV in
Basic, we shall be interested in existence-uniqueness questions for our equation 
or system, both local and global, and in qualitative properties of solutions, such 
as regularity, the propagation of singularities, and any special features. For a 
particular equation or system we might be interested in any of the following three 
problems: 
(i) to find one or more particular solutions, 
(ii) to find all solutions, 
(iii) to find those solutions meeting some initial or boundary conditions. 
Problems of the third type are known as boundary-value problems or
initial-value problems.¹ The method of separation of variables in Section I.2 is
particularly adapted to solving this kind of problem in special situations.

For ordinary differential equations and systems these three problems are closely 
related, as we saw in the course of investigating existence and uniqueness in 
Chapter IV of Basic. For partial differential equations they turn out to be 
comparatively distinct. We can, however, use the kind of setup with first-order 
systems of ordinary differential equations to get an idea how much flexibility 
there is for the solutions to the system. Let us treat one of the variables x
as distinguished² and suppose, in analogy with what happened in the case of
ordinary differential equations, that the system consists of an expression for the 
derivative with respect to x of each of the unknown functions in terms of the 
variables, the unknown functions, and the other first partial derivatives of the 
functions. Writing down general formulas involves complicated notation that 
may obscure the simple things that happen; thus let us suppose concretely that 
the independent variables are x, y and that the unknown functions are u, v. The 
system is then to be 


    u_x = F(x, y, u, v, u_y, v_y),
    v_x = G(x, y, u, v, u_y, v_y).


With x still regarded as special, let us suppose that u and v are known when
x = 0, i.e., that

    u(0, y) = f(y),
    v(0, y) = g(y).


The real-variable approach of Chapter IV of Basic is not very transparent for this
situation; an approach via power series looks much easier to apply. Thus we
assume whatever smoothness is necessary, and we look for formal power series
solutions in x, y. The question is then whether we can determine all the partial
derivatives of all orders of u and v at a point like (0, 0). It is enough to see that the
system and the initial conditions determine (∂^k u/∂x^k)(0, y) and (∂^k v/∂x^k)(0, y)
for all k ≥ 0. For k = 0, the initial conditions give the values. For k = 1, we
substitute x = 0 into the system itself and get values, provided we know values of
all the variables at (0, y). The values of u and v come from k = 0, and the values
of u_y and v_y

¹The distinction between these terms has nothing to do with the mathematics and instead is a
question of whether all variables are regarded as space variables or one variable is to be interpreted
as a time variable.

²It is natural to think of this variable as representing time and to say that the differential equation
and any conditions imposed at a particular value of this variable constitute an initial-value problem.
come from differentiating those expressions with respect to y. For k = 2, we
differentiate each equation of the system with respect to x and then put x = 0. For
each equation we get a sum of partial derivatives of F, evaluated as before, times
the partial of each variable with respect to x. For the latter we need expressions
for u_x, v_x, u_{xy}, and v_{xy}; we have them since we know u_x(0, y) and v_x(0, y) from
the step k = 1. This handles k = 2. For higher k, we can proceed inductively by
continuing to differentiate the given system, but let us skip the details. The result
is that the initial values of u(0, y) and v(0, y) are enough to determine unique
formal power-series solutions satisfying those initial values.
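
The inductive procedure can be carried out mechanically in a simple linear case. The sketch below takes F = v_y and G = −u_y, so that the system is the Cauchy–Riemann system u_x = v_y, v_x = −u_y written earlier in this section, and it assumes polynomial initial data f and g, stored as coefficient lists in y. Because this system is linear with constant coefficients, differentiating it k times in x gives ∂_x^{k+1}u = ∂_y ∂_x^k v and ∂_x^{k+1}v = −∂_y ∂_x^k u, which is the recursion the code uses. The function names are illustrative choices, not notation from the text.

```python
from math import factorial


def dy(poly):
    """y-derivative of a polynomial given by its coefficient list [c0, c1, ...]."""
    return [k * c for k, c in enumerate(poly)][1:] or [0]


def x_derivatives(f, g, order):
    """Return lists U, V where U[k] and V[k] are the polynomials in y equal to
    the k-th x-derivatives of u and v at (0, y), for u_x = v_y, v_x = -u_y
    with u(0, y) = f(y) and v(0, y) = g(y)."""
    U, V = [list(f)], [list(g)]
    for _ in range(order):
        u_k, v_k = U[-1], V[-1]
        U.append(dy(v_k))                    # next x-derivative of u is d/dy of v's
        V.append([-c for c in dy(u_k)])      # next x-derivative of v is -d/dy of u's
    return U, V


def evaluate(derivs, x, y):
    """Sum of derivs[k](y) * x^k / k!, the truncated Taylor series in x."""
    return sum(
        sum(c * y**j for j, c in enumerate(poly)) * x**k / factorial(k)
        for k, poly in enumerate(derivs)
    )


# Initial data u(0, y) = -y^2 and v(0, y) = 0.
U, V = x_derivatives([0, 0, -1], [0], order=4)
print(U[:3], V[:3])
print(evaluate(U, 3, 2), evaluate(V, 3, 2))
```

With these initial data the recursion stops after finitely many steps and produces u = x^2 − y^2 and v = 2xy, the real and imaginary parts of (x + iy)^2. For nonlinear F and G the same idea applies, but each step requires the chain rule as described in the text rather than this two-line recursion.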

Next, under the hypothesis that F, G, f, and g are holomorphic functions of 
their variables near an initial point, one can prove convergence of the resulting 
two-variable power series near (0,0). This fact persists when the number of 
equations and the number of unknown functions are increased but remain equal, 
and when the domain variables are arbitrary in number. The theorem is as follows. 


Theorem 7.1 (Cauchy–Kovalevskaya Theorem, first form). Let a system of
$p$ partial differential equations with $p$ unknown functions $u^{(1)}, \dots, u^{(p)}$ and $N$
variables $x_1, \dots, x_N$ of the form

\[
\begin{aligned}
\frac{\partial u^{(1)}}{\partial x_1} &= F_1\Bigl(u^{(1)}, \dots, u^{(p)};\ \frac{\partial u^{(1)}}{\partial x_2}, \dots, \frac{\partial u^{(p)}}{\partial x_N}\Bigr),\\
&\ \,\vdots\\
\frac{\partial u^{(p)}}{\partial x_1} &= F_p\Bigl(u^{(1)}, \dots, u^{(p)};\ \frac{\partial u^{(1)}}{\partial x_2}, \dots, \frac{\partial u^{(p)}}{\partial x_N}\Bigr),
\end{aligned}
\tag{$*$}
\]

be given, subject to the initial conditions

\[
\begin{aligned}
u^{(1)}(0, x_2, \dots, x_N) &= f_1(x_2, \dots, x_N),\\
&\ \,\vdots\\
u^{(p)}(0, x_2, \dots, x_N) &= f_p(x_2, \dots, x_N).
\end{aligned}
\tag{$**$}
\]

Suppose that $f_1, \dots, f_p$ are holomorphic in a neighborhood in $\mathbb{C}^{N-1}$ of the point
$(x_2, \dots, x_N) = (x_2^0, \dots, x_N^0)$ and that $F_1, \dots, F_p$ are holomorphic in a neighborhood
in $\mathbb{C}^{pN}$ of the value of the argument $u^{(1)}, \dots, \partial u^{(p)}/\partial x_N$ of the $F_i$'s that corresponds
to $(0, x_2^0, \dots, x_N^0)$. Then there exists a neighborhood of $(x_1, x_2, \dots, x_N) =
(0, x_2^0, \dots, x_N^0)$ in $\mathbb{C}^N$ in which the system $(*)$ has a holomorphic solution satisfying
the initial conditions $(**)$. Moreover, on any connected subneighborhood
of $(0, x_2^0, \dots, x_N^0)$, there is no other holomorphic solution satisfying the initial
conditions.

We omit the proof since we shall use the theorem in this generality only as a
guide for how much in the way of initial conditions needs to be imposed to expect
uniqueness without compromising existence. Initial conditions of the form $(**)$
for a system of equations $(*)$ are called Cauchy data.

We shall, however, make use of a special case of Theorem 7.1, where a better 
conclusion is available. 


Theorem 7.2. In the Cauchy–Kovalevskaya system of Theorem 7.1, suppose
that the functions $F_i$ in the system $(*)$ are of the form

\[
F_i = \sum_{j=1}^{p} a_{ij}\,u^{(j)} + \sum_{j=1}^{p}\sum_{k=2}^{N} c_{ijk}\,\frac{\partial u^{(j)}}{\partial x_k} + h_i(x_1, \dots, x_N)
\]

with the $a_{ij}$ and $c_{ijk}$ constant and with each $h_i$ a given entire holomorphic function
on $\mathbb{C}^N$. Suppose further that the functions $f_i(x_2, \dots, x_N)$ in the initial conditions
$(**)$ are entire holomorphic functions on $\mathbb{C}^{N-1}$. Then the system $(*)$ has an entire
holomorphic solution satisfying the initial conditions $(**)$.


This theorem is proved in Problems 6-9 at the end of the chapter without 
making use of Theorem 7.1. We shall use it in proving Theorem 7.4 below, 
which in turn will be applied in Section 5. 

Since our interest is really in single equations and we want to allow order > 1, 
we can ask whether we can carry over to partial differential equations the familiar 
device for ordinary differential equations of introducing new unknown functions 
to change a higher-order equation to a first-order system. 

Recall with an ordinary differential equation of order $n$ for an unknown function
$y(t)$ when the equation is $y^{(n)} = F(t, y, y', \dots, y^{(n-1)})$: we can introduce
unknown functions $y_1, \dots, y_n$ satisfying $y_1 = y$, $y_2 = y'$, ..., $y_n = y^{(n-1)}$,
and we obtain an equivalent first-order system $y_1' = y_2$, ..., $y_{n-1}' = y_n$,
$y_n' = F(t, y_1, y_2, \dots, y_n)$. Values for $y, y', \dots, y^{(n-1)}$ at $t = t_0$ correspond to
values at $t = t_0$ for $y_1, y_2, \dots, y_n$ and give us equivalent initial-value problems.
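
The reduction just recalled can be tried numerically. The following is a minimal sketch, not part of the text's development: it assumes the sample equation $y'' = -y$ with $y(0) = 0$, $y'(0) = 1$ (whose solution is $\sin t$), and the helper names `to_first_order` and `rk4` are hypothetical. The classical Runge-Kutta scheme is used only as a convenient integrator for the first-order system.

```python
import math

def to_first_order(F):
    """Turn y'' = F(t, y, y') into the system (y1, y2)' = (y2, F(t, y1, y2))."""
    def rhs(t, state):
        y1, y2 = state
        return (y2, F(t, y1, y2))
    return rhs

def rk4(rhs, t0, state0, t1, steps):
    """Integrate state' = rhs(t, state) from t0 to t1 by classical fourth-order Runge-Kutta."""
    h = (t1 - t0) / steps
    t, state = t0, tuple(state0)
    for _ in range(steps):
        k1 = rhs(t, state)
        k2 = rhs(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k1)))
        k3 = rhs(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k2)))
        k4 = rhs(t + h, tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        t += h
    return state

# Sample problem (an assumption for illustration): y'' = -y, y(0) = 0, y'(0) = 1.
# The initial values of (y, y') are exactly the initial values of (y1, y2).
rhs = to_first_order(lambda t, y, yp: -y)
y_end, yp_end = rk4(rhs, 0.0, (0.0, 1.0), math.pi / 2, 200)
```

Here `y_end` approximates $\sin(\pi/2)$ and `yp_end` approximates $\cos(\pi/2)$, which reflects the equivalence of the two initial-value problems.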

For a single higher-order partial differential equation of order $m$ in which the
$m$th derivative of the unknown function with respect to one of the variables $x$
is equal to a function of everything else, the same kind of procedure changes a
suitable initial-value problem into an initial-value problem for a first-order system
as above. But if we ignore the initial values, the solutions of the single equation
need not match the solutions of the system. Let us see what happens for a single
second-order equation in two variables $x, y$ for an unknown function $u$ under the
assumption that we have solved for $u_{xx}$. Thus consider the equation

\[ u_{xx} = F(x, y, u, u_x, u_y, u_{xy}, u_{yy}) \]

with initial data

\[ u(0, y) = f(y), \qquad u_x(0, y) = g(y). \]


This is another instance in which the initial data are known as Cauchy data:
the equation has order $m$, and we are given the values of $u$ and its derivatives
through order $m - 1$ with respect to $x$ at the points of the domain where $x = 0$.
For this example, introduce variables $u, p, q, r, s, t$ equal, respectively, to
$u, u_x, u_y, u_{xx}, u_{xy}, u_{yy}$. With these interpretations of the variables, the given
equation becomes $r = F(x, y, u, p, q, s, t)$, and we differentiate this identity
with respect to $x$ to make it more convenient to use. Then $u$ yields a solution of a
system of six first-order equations, namely

\[
\begin{aligned}
u_x &= p,\\
p_x &= r,\\
q_x &= p_y,\\
r_x &= F_x + pF_u + rF_p + sF_q + r_yF_s + s_yF_t,\\
s_x &= r_y,\\
t_x &= s_y.
\end{aligned}
\]


The choice here of $q_x = p_y$ rather than $q_x = s$ is important; we will not be able
to invert the initial-value problem without it. The initial data will be values of
$u, p, q, r, s, t$ at $(0, y)$, and we can read off what we must use from the above
values of $u(0, y)$ and $u_x(0, y)$, namely

\[
\begin{aligned}
u(0, y) &= f(y),\\
p(0, y) &= g(y),\\
q(0, y) &= f'(y),\\
r(0, y) &= F\bigl(0, y, f(y), g(y), f'(y), g'(y), f''(y)\bigr),\\
s(0, y) &= g'(y),\\
t(0, y) &= f''(y).
\end{aligned}
\]


If $u$ satisfies the initial-value problem for the single equation, then the definitions
of $u, p, q, r, s, t$ give us a solution of the initial-value problem for the system.

Let us show that a solution $u, p, q, r, s, t$ of the initial-value problem for the
system has to make $u$ be a solution of the initial-value problem for the single
equation. What needs to be shown is that $u_y = q$, $u_{xy} = s$, and $u_{yy} = t$. We use
the same kind of argument with all three.

For $u_y = q$, we see from the system that $(u_y)_x = (u_x)_y = p_y = q_x$, so that
$(u_y - q)_x = 0$. Therefore $u_y(x, y) - q(x, y) = h(y)$ for some function $h$. Setting
$x = 0$ gives $h(y) = u_y(0, y) - q(0, y) = f'(y) - f'(y) = 0$. Thus $h(y) = 0$,
and we obtain $u_y = q$.

Similarly for $u_{xy} = s$, we start from $u_{xxy} = p_{xy} = r_y = s_x$, so that $(u_{xy} - s)_x = 0$.
Therefore $u_{xy}(x, y) - s(x, y) = k(y)$ for some function $k$. Setting $x = 0$
gives $k(y) = u_{xy}(0, y) - s(0, y) = p_y(0, y) - s(0, y) = g'(y) - g'(y) = 0$.
Thus $k(y) = 0$, and we obtain $u_{xy} = s$.

Finally for $u_{yy} = t$, we start from $u_{xyy} = (u_{xy})_y = s_y = t_x$, so that $(u_{yy} - t)_x = 0$.
Therefore $u_{yy}(x, y) - t(x, y) = l(y)$ for some function $l$. Setting $x = 0$ gives
$l(y) = u_{yy}(0, y) - t(0, y) = f''(y) - f''(y) = 0$. Thus $l(y) = 0$, and we obtain
$u_{yy} = t$.

The conclusion is that the given second-order equation with two initial con- 
ditions is equivalent to the system of six first-order equations with six initial 
conditions. In other words the Cauchy data for the single equation lead to Cauchy 
data for an equivalent first-order system. It turns out that if a single equation of 
order $m$ has one unknown function and is written as solved for the $m$th derivative
with respect to one of the variables $x$, and if the given Cauchy data consist of the values at
$x = x_0$ of the unknown function and its derivatives through order $m - 1$, then
the equation can always be converted in this way into an equivalent first-order 
system with given Cauchy data. The steps of the reduction to Theorem 7.1 are 
carried out in Problems 10-11 at the end of the chapter. The result is as follows. 


Theorem 7.3 (Cauchy–Kovalevskaya Theorem, second form). Let a single
partial differential equation of order $m$ in the variables $(x, y) = (x, y_1, \dots, y_{N-1})$
of the form

\[ D_x^m u = F\bigl(x, y;\ u;\ \text{all } D_x^k D_y^\alpha u \text{ with } k < m \text{ and } k + |\alpha| \le m\bigr) \tag{$*$} \]

be given, subject to the initial conditions

\[ D_x^i u(0, y) = f^{(i)}(y) \quad \text{for } 0 \le i < m. \tag{$**$} \]

Here $\alpha$ is assumed to be a multi-index $\alpha = (\alpha_1, \dots, \alpha_{N-1})$ corresponding to the
$y$ variables. Suppose that $f^{(0)}, \dots, f^{(m-1)}$ are holomorphic in a neighborhood
in $\mathbb{C}^{N-1}$ of the point $(y_1, \dots, y_{N-1}) = (y_1^0, \dots, y_{N-1}^0)$ and that $F$ is holomorphic
in a neighborhood of the value of its argument corresponding to $x = 0$
and $(y_1, \dots, y_{N-1}) = (y_1^0, \dots, y_{N-1}^0)$. Then there exists a neighborhood of
$(x, y_1, \dots, y_{N-1}) = (0, y_1^0, \dots, y_{N-1}^0)$ in $\mathbb{C}^N$ in which the equation $(*)$ has a
holomorphic solution satisfying the initial conditions $(**)$. Moreover, on any
connected subneighborhood of $(0, y_1^0, \dots, y_{N-1}^0)$, there is no other holomorphic
solution satisfying the initial conditions.

In the special case that $F$ is the sum of a known entire holomorphic function and
a linear combination with constant coefficients of $x$, $y$, and the various $D_x^k D_y^\alpha u$,
the steps that reduce Theorem 7.3 to Theorem 7.1 perform a reduction to Theorem
7.2. We therefore obtain a better conclusion under these hypotheses, as follows.


Theorem 7.4. Let a single partial differential equation of order $m$ in the
variables $(x, y) = (x, y_1, \dots, y_{N-1})$ of the form

\[
D_x^m u = ax + b_1y_1 + \cdots + b_{N-1}y_{N-1} + \sum_{\substack{0 \le k < m \\ k + |\alpha| \le m}} c_{k,\alpha}\, D_x^k D_y^\alpha u + h(x, y_1, \dots, y_{N-1}) \tag{$*$}
\]

be given, subject to the initial conditions

\[ D_x^i u(0, y) = f^{(i)}(y) \quad \text{for } 0 \le i < m. \tag{$**$} \]

Suppose that $f^{(0)}, \dots, f^{(m-1)}$ are entire holomorphic on $\mathbb{C}^{N-1}$ and that $h$ is entire
holomorphic on $\mathbb{C}^N$. Then the equation $(*)$ has an entire holomorphic solution
satisfying the initial conditions $(**)$.


The steps in the reduction of this theorem to Theorem 7.2 are indicated for 
N = 2 in Problem 11 at the end of the chapter, and the steps for general N 
are similar. We shall make use of Theorem 7.4 to prove the existence of certain 
“fundamental solutions” in Section 5. 

As we said, in this reduction from an initial-value problem for a single equation 
to an initial-value problem for a first-order system, the equation without initial 
values is not always equivalent to the system without initial values. A simple 
example will suffice. In the second-order setup as above, let the given equation
be $u_{xx} = -u_{yy} + 4$. That is, let $F(x, y, u, u_x, u_y, u_{xy}, u_{yy}) = -u_{yy} + 4$. This
equation has $u = x^2 + y^2$ as a solution, for example. If we introduce variables
$u, p, q, r, s, t$ as above, we find that $F(x, y, u, p, q, s, t) = -t + 4$, and we
obtain the system

\[
\begin{aligned}
u_x &= p,\\
p_x &= r,\\
q_x &= p_y,\\
r_x &= F_x + pF_u + rF_p + sF_q + r_yF_s + s_yF_t = -s_y,\\
s_x &= r_y,\\
t_x &= s_y.
\end{aligned}
\]

If we put $u = x^2$, $p = 2x$, and $r = 2$ and take $q$, $s$, and $t$ to be constants, for
instance $q = s = t = 0$, we find that this tuple $(u, p, q, r, s, t)$ solves the system. But $u = x^2$ is not a
solution of $u_{xx} = -u_{yy} + 4$.
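
The failure of equivalence without initial values can be checked mechanically. The sketch below is only an illustration under stated assumptions: polynomials in $x, y$ are stored as dictionaries mapping exponent pairs to coefficients, the helper names are hypothetical, and the tuple tested takes $q = s = t = 0$, which is one choice compatible with $u = x^2$.

```python
def dx(p):
    """Partial derivative in x of a polynomial stored as {(i, j): coeff} for x^i y^j."""
    return {(i - 1, j): i * c for (i, j), c in p.items() if i > 0}

def dy(p):
    """Partial derivative in y of a polynomial stored as {(i, j): coeff} for x^i y^j."""
    return {(i, j - 1): j * c for (i, j), c in p.items() if j > 0}

def add(p, q, scale=1):
    """Return p + scale*q, dropping zero coefficients."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + scale * c
    return {key: c for key, c in out.items() if c != 0}

def solves_equation(u):
    """Check u_xx = -u_yy + 4 exactly."""
    return dx(dx(u)) == add({(0, 0): 4}, dy(dy(u)), scale=-1)

# Assumed tuple: u = x^2, p = 2x, q = 0, r = 2, s = 0, t = 0.
u, p, q, r, s, t = {(2, 0): 1}, {(1, 0): 2}, {}, {(0, 0): 2}, {}, {}

# The six first-order equations, with r_x = -s_y because F = -t + 4.
system_ok = all([
    dx(u) == p,
    dx(p) == r,
    dx(q) == dy(p),
    dx(r) == add({}, dy(s), scale=-1),
    dx(s) == dy(r),
    dx(t) == dy(s),
])
equation_ok = solves_equation(u)                               # u = x^2
reference_ok = solves_equation({(2, 0): 1, (0, 2): 1})         # u = x^2 + y^2
```

With these choices `system_ok` and `reference_ok` are true while `equation_ok` is false: the tuple satisfies the differentiated system, but the undifferentiated relation $r = F$ fails, since $r = 2$ while $-t + 4 = 4$.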

There is a still more general Cauchy–Kovalevskaya Theorem than anything we
have considered, still involving local holomorphic systems, data, and solutions. 
It amounts to whatever one can get by combining the Implicit Function Theorem, 
the technique of reduction of order via an increase in the number of equations, 
and Theorem 7.1. We omit the precise statement. The word “noncharacteristic” 
is used to describe situations in which the Implicit Function Theorem applies for 
this purpose. 

Cauchy data are not the only kinds of initial data that one might consider.
In fact, none of the examples with separation of variables in Section I.2 used
Cauchy data. A typical example from that section is the Dirichlet problem for
the Laplacian in the unit disk. The equation can be written as $u_{xx} = -u_{yy}$, and
Cauchy data would consist of values of $u(x_0, y)$ and $u_x(x_0, y)$. This amounts to
two functions on a piece of a line in the plane, and one could handle two functions
on a suitable curve in the plane after applying the Implicit Function Theorem. By
contrast, the Dirichlet problem requires just a single function on the unit circle for a
unique solution. A more apt comparison is to think of a Sturm–Liouville problem
as being an ordinary-differential-equations analog of the Dirichlet problem. A
particular Sturm–Liouville problem to compare with the Dirichlet problem for
the disk is the equation $u_{xx} = 0$ with boundary conditions $u(0) = u(\pi) = 0$.
The region is a ball in 1-dimensional space, and the function is specified on the
boundary; the function is uniquely determined without specifying the derivative
on the boundary. However, if the equation is changed to $u_{xx} = -\lambda u$ for some
positive constant $\lambda$, then the solution is no longer unique when $\lambda$ is the square of a
nonzero integer.
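
The two situations in the last sentences can be made explicit. The following short computation is offered as an illustration; the constants $A$, $B$, $c$ and the integer $n$ are notation introduced here.

```latex
% Case u'' = 0 on [0, \pi]: the general solution is affine, and the boundary values force it to vanish.
u(x) = Ax + B, \qquad u(0) = B = 0, \qquad u(\pi) = A\pi = 0 \ \Longrightarrow\ u \equiv 0 .

% Case u'' = -\lambda u with \lambda = n^2, n a nonzero integer: a one-parameter family of solutions.
u_c(x) = c\,\sin(nx), \qquad u_c'' = -n^2 c\,\sin(nx) = -\lambda u_c, \qquad
u_c(0) = 0, \qquad u_c(\pi) = c\,\sin(n\pi) = 0 .
```

Every value of $c$ gives a solution in the second case, so the boundary values alone do not determine $u$.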


2. Orientation 


After this essay on what is appropriate for existence and uniqueness, let us turn to 
some other aspects of partial differential equations and systems. A few principles 
and observations will influence what we do in the upcoming sections of this 
chapter. 


The subjects of linear systems and nonlinear systems of partial differential 
equations cannot be completely separated. 

For example let $a(x, y)$ and $b(x, y)$ be given functions on an open set in $\mathbb{R}^2$,
and consider the single linear equation

\[ a(x, y)u_x + b(x, y)u_y = 0 \]

for an unknown function $u(x, y)$. If we look for curves $c(t) = (x(t), y(t))$
along which such a function $u(x, y)$ is constant, the condition on $c$ is that
$\frac{d}{dt}u(x(t), y(t)) = 0$, hence that

\[ x'(t)u_x(x(t), y(t)) + y'(t)u_y(x(t), y(t)) = 0. \]

One way for this equation to be satisfied is that $c(t) = (x(t), y(t))$ satisfy the
system

\[ x'(t) = a(x, y), \qquad y'(t) = b(x, y), \]

of two ordinary differential equations. This system is nonlinear, and the condition
for $c(t)$ to solve it is that $c(t)$ be an integral curve. Thus $u$ is a solution if it is
constant along each integral curve. If we introduce two parameters, one varying
along an integral curve and the other indexing a family of integral curves, then
we obtain solutions by letting $u$ be any function of the second parameter. Under
reasonable assumptions, these solutions turn out to be the only solutions locally,
and thus the solution of a certain linear partial differential equation reduces to
solving a nonlinear system in fewer variables. Despite this circumstance the
partial differential equations of interest to us will be the linear ones.
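
As a small numerical illustration of integral curves, the sketch below assumes the sample coefficients $a(x, y) = -y$ and $b(x, y) = x$, for which the integral curves are circles about the origin and $u = x^2 + y^2$ satisfies $a u_x + b u_y = 0$. The function name `integral_curve` is hypothetical, and this particular system happens to be linear; the general case need not be.

```python
def integral_curve(a, b, x0, y0, t_end, steps):
    """Integrate x' = a(x, y), y' = b(x, y) from t = 0 by classical RK4; return the points visited."""
    h = t_end / steps
    x, y = x0, y0
    points = [(x, y)]
    for _ in range(steps):
        k1x, k1y = a(x, y), b(x, y)
        k2x, k2y = a(x + h / 2 * k1x, y + h / 2 * k1y), b(x + h / 2 * k1x, y + h / 2 * k1y)
        k3x, k3y = a(x + h / 2 * k2x, y + h / 2 * k2y), b(x + h / 2 * k2x, y + h / 2 * k2y)
        k4x, k4y = a(x + h * k3x, y + h * k3y), b(x + h * k3x, y + h * k3y)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        points.append((x, y))
    return points

# Assumed sample data: a = -y, b = x, and the candidate solution u = x^2 + y^2.
a = lambda x, y: -y
b = lambda x, y: x
u = lambda x, y: x * x + y * y

points = integral_curve(a, b, 1.0, 0.0, 3.0, 300)
values = [u(x, y) for x, y in points]
spread = max(values) - min(values)   # how far u drifts along the computed curve
```

Up to integration error, `spread` is zero, in line with the statement that a solution is constant along each integral curve.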


As we have seen, there is a distinction between the reduction of a partial 
differential equation to a first-order system of Cauchy type and the reduction of 
a Cauchy problem for the equation to the corresponding Cauchy problem for the 
first-order system. 

One consequence is that finding a several-parameter set of solutions of a partial 
differential equation may not be very helpful in solving a specific boundary- 
value problem about the equation. With an eye on the wave equation, let us take 
as an example a homogeneous linear equation with constant coefficients. Let 
$P : \mathbb{R}^{N+1} \to \mathbb{C}$ be a polynomial such as $P(x_0, x_1, \dots, x_N) = x_0^2 - x_1^2 - \cdots - x_N^2$
in the case of the wave equation, $x_0$ being the time variable. We write the equation
in our notation with $D$ as

\[ P(D)u = 0, \]

understanding as usual that $\partial/\partial x_j$ is to be substituted in $P$ everywhere that $x_j$
appears. If $\alpha$ is any $(N + 1)$-tuple, then $(\partial/\partial x_j)e^{\alpha\cdot x} = \alpha_j e^{\alpha\cdot x}$. Consequently
$P(D)e^{\alpha\cdot x} = P(\alpha)e^{\alpha\cdot x}$, and $e^{\alpha\cdot x}$ solves the equation $P(D)u = 0$ whenever $P(\alpha) = 0$.
Concretely with the wave equation, let $\omega$ be a real number, let $\beta = (\beta_1, \dots, \beta_N)$
be in $\mathbb{R}^N$, and write $x = (t, x')$. Then $e^{\omega t - \beta\cdot x'}$ solves the wave equation whenever
$\omega^2 = |\beta|^2$. Apart from the one constraint $\omega^2 = |\beta|^2$, we obtain an $N$-parameter
family of solutions of the wave equation. But this family of solutions is not of
any obvious help in solving boundary-value problems such as those encountered
in Section I.2. We shall discuss this example further shortly.

Global problems involving linear partial differential equations with constant
coefficients lend themselves to use of the Fourier transform. 

The reason is that the Fourier transform carries differentiation into multiplication
by a function. Specifically under suitable conditions on $f$, the relevant
formula is $\mathcal{F}\bigl(\frac{\partial f}{\partial x_j}\bigr)(\xi) = 2\pi i\,\xi_j(\mathcal{F}f)(\xi)$ if we use $\xi$ for the Fourier transform
variable.

Thus, at least on a formal level, to find a solution of an inhomogeneous
equation $P(D)u = f$, we can take the Fourier transform of both sides, obtaining
$P(2\pi i\xi)(\mathcal{F}u)(\xi) = (\mathcal{F}f)(\xi)$. Then we divide by $P(2\pi i\xi)$ and take the inverse
Fourier transform. In Section II.1 we carried out the steps of this process for
the equation $(1 - \Delta)u = f$ when $f$ is in the Schwartz space. In this case the
polynomial is $1 + 4\pi^2|\xi|^2$, and we found that there is a solution $u$ in the Schwartz
space.
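
The divide-by-the-polynomial recipe can be imitated on a computer in a periodic setting. The following is a rough sketch under assumptions not made in the text: functions are taken to be periodic of period 1 in one variable and are sampled on a uniform grid, the Fourier transform is replaced by a naive discrete Fourier transform, and the names `dft` and `solve_one_minus_laplacian` are hypothetical. The frequency $k$ plays the role of $\xi$, so that $(1 - \Delta)$ corresponds to multiplication by $1 + 4\pi^2 k^2$.

```python
import cmath
import math

def dft(values, sign):
    """Naive discrete Fourier transform; sign=-1 is forward, sign=+1 is the unnormalized inverse."""
    n = len(values)
    return [sum(values[m] * cmath.exp(sign * 2j * math.pi * m * k / n) for m in range(n))
            for k in range(n)]

def solve_one_minus_laplacian(f_samples):
    """Solve (1 - d^2/dx^2) u = f for 1-periodic data by dividing each mode by 1 + 4 pi^2 k^2."""
    n = len(f_samples)
    f_hat = dft(f_samples, -1)
    u_hat = []
    for index in range(n):
        k = index if index <= n // 2 else index - n   # integer frequency carried by this index
        u_hat.append(f_hat[index] / (1 + 4 * math.pi ** 2 * k ** 2))
    return [z.real / n for z in dft(u_hat, +1)]

# Sample right side (an assumption): f(x) = cos(2 pi x), whose solution is f / (1 + 4 pi^2).
n = 32
xs = [m / n for m in range(n)]
f = [math.cos(2 * math.pi * x) for x in xs]
u = solve_one_minus_laplacian(f)
exact = [math.cos(2 * math.pi * x) / (1 + 4 * math.pi ** 2) for x in xs]
err = max(abs(p - q) for p, q in zip(u, exact))
```

Because $1 + 4\pi^2 k^2$ never vanishes, no division problem arises in this example; for a general polynomial the zeros of the multiplier would have to be handled separately.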

In practice the function $P(2\pi i\xi)$ may be zero in some places, and then we
have to check what happens with the division. There will also be a matter of 
ensuring that the inverse Fourier transform is well defined where we want it to 
be. 

In Section 3 we shall use multiple Fourier series to see that a linear equation
$P(D)u = f$ with constant coefficients and with $f$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ always has a
solution in a neighborhood of a point. It is of interest also to know what happens
when $f$ is replaced by a function with fewer derivatives or even by a distribution
of compact support. This matter is addressed in Problem 5 at the end of the 
chapter. 


For a linear partial differential equation of order m, the terms with differen- 
tiations of total order m are especially important. Moreover, a linear equation 
with variable coefficients can sometimes be studied near a point $x_0$ of the domain
by applying a “freezing principle.” 

We explain the notion of a freezing principle in a moment. We shall now make
use of the notation of Chapter V for linear differential operators $L$, often writing
an equation under study as $Lu = f$ with $f$ known and $u$ unknown. Here $L$ is
given by

\[ L = P(x, D) = \sum_{|\alpha| \le m} a_\alpha(x) D^\alpha \]

for some $m$, or we can write

\[ L = P(x, D_x) = \sum_{|\alpha| \le m} a_\alpha(x) D_x^\alpha \]

if the variable $x$ of differentiation needs emphasis. It is customary to assume that
$m$ is the order of $L$, in which case some $a_\alpha(x)$ with $|\alpha| = m$ is not identically zero.
The domain is to be an open set in real Euclidean space, usually $\mathbb{R}^N$; thus $x$ varies
in that open set, and the multi-index $\alpha$ is an $N$-tuple of nonnegative integers.

The idea of a freezing principle is that the behavior of solutions of $P(x, D)u = f$
near $x = x_0$ can sometimes be studied by considering solutions of the equation
$(P(x_0, D_x)u)(x) = f(x)$ and making estimates for how much effect the
variability of $x$ might have. For equations that are “elliptic” in a sense that
we define shortly, the classical approach to the equations via something called
“Gårding’s inequality” used this idea and worked well. We shall indicate a more
recent approach via “pseudodifferential operators” in Section 6 and will omit any
discussion of details concerning Gårding’s inequality in our development. The
freezing principle is somewhat concealed within the mechanism of pseudodif- 
ferential operators, but it is at least visible in the notation that is used for such 
operators. 

As far as theorems for nonelliptic operators are concerned, the idea of a 
freezing principle is meaningful but has its limitations. We have noted that linear 
differential equations with constant coefficients are at least locally solvable, a 
result that will be proved in Section 3. But the same is not always true for 
equations with variable coefficients. In 1957 Hans Lewy gave an example in $\mathbb{R}^3$
involving the linear differential operator

\[ P(x, D) = -(D_1 + iD_2) + 2i(x_1 + ix_2)D_3. \]

For a certain function $f$ of class $C^\infty$ that is nowhere real analytic, the equation
$P(x, D)u = f$ admits no solution in any nonempty open set. By contrast, if $f$
is holomorphic, the Cauchy–Kovalevskaya Theorem (Theorem 7.3) ensures the
existence of local solutions.

In the linear differential operator $P(x, D_x) = \sum_{|\alpha| \le m} a_\alpha(x) D_x^\alpha$, the terms of
highest order are of special interest; we group them and give them their own
name:

\[ P_m(x, D_x) = \sum_{|\alpha| = m} a_\alpha(x) D_x^\alpha. \]

In line with the freezing principle, when one takes a Fourier transform, one
does not apply the Fourier transform to the coefficients of $L$, only to the various
$D^\alpha$'s. Recalling that $D^\alpha$ goes into multiplication by $(2\pi i)^{|\alpha|}\xi^\alpha$ under the Fourier
transform, we introduce the expressions³

\[
P(x, 2\pi i\xi) = \sum_{|\alpha| \le m} a_\alpha(x)(2\pi i\xi)^\alpha
\qquad\text{and}\qquad
P_m(x, 2\pi i\xi) = \sum_{|\alpha| = m} a_\alpha(x)(2\pi i\xi)^\alpha.
\]

These are called the symbol and the principal symbol of $L$, respectively.

³The Fourier transform variable $\xi$ lies in the dual space of $\mathbb{R}^N$. To take maximum advantage of
this fact in more advanced treatments, one wants to identify $\mathbb{R}^N$ with the tangent space at $x$ to the
domain open set. Then $\xi$ is to be regarded as a member of the dual of the tangent space at $x$, and
to some extent, the formalism makes sense on smooth manifolds. We elaborate on these remarks in
Chapter VIII.


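EXAMPLES. The Laplacian, the wave operator, and the heat operator have order
$m = 2$, while the Cauchy–Riemann operator has $m = 1$. In all these cases except
the heat operator, the symbol and the principal symbol coincide. The operators
written with the notation $D$ are

\[
\begin{aligned}
&\Delta = \Delta_x = D_1^2 + \cdots + D_N^2 \ \text{ in } \mathbb{R}^N && \text{(Laplacian)},\\
&D_1 + iD_2 && \text{(Cauchy–Riemann operator)},\\
&D_0^2 - \Delta_x \ \text{ in } \mathbb{R}^{N+1} && \text{(wave operator)},\\
&D_0 - \Delta_x \ \text{ in } \mathbb{R}^{N+1} && \text{(heat operator)}.
\end{aligned}
\]

The principal symbols $P_m(x, 2\pi i\xi)$ in each case are independent of $x$ and are as
follows:

\[
\begin{aligned}
&-4\pi^2(\xi_1^2 + \cdots + \xi_N^2) && \text{(Laplacian)},\\
&2\pi i\xi_1 - 2\pi\xi_2 && \text{(Cauchy–Riemann operator)},\\
&-4\pi^2\xi_0^2 + 4\pi^2(\xi_1^2 + \cdots + \xi_N^2) && \text{(wave operator)},\\
&4\pi^2(\xi_1^2 + \cdots + \xi_N^2) && \text{(heat operator)}.
\end{aligned}
\]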
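
These symbols can be evaluated directly from the coefficients. The sketch below is illustrative only: it assumes constant coefficients, represents an operator as a dictionary from multi-indices to coefficients, and works in two variables (so that the wave and heat operators use the variables $x_0, x_1$). The function name `symbol` and the sample point `xi` are hypothetical.

```python
import math

def symbol(coeffs, xi, principal_only=False):
    """Evaluate the sum of a_alpha * (2*pi*i*xi)^alpha for constant coefficients a_alpha."""
    order = max(sum(alpha) for alpha in coeffs)
    total = 0j
    for alpha, a in coeffs.items():
        if principal_only and sum(alpha) != order:
            continue                      # keep only the terms of top order
        term = complex(a)
        for xi_j, alpha_j in zip(xi, alpha):
            term *= (2j * math.pi * xi_j) ** alpha_j
        total += term
    return total

# Operators in two variables, keyed by multi-index (alpha_1, alpha_2) or (alpha_0, alpha_1).
laplacian = {(2, 0): 1, (0, 2): 1}        # D_1^2 + D_2^2
cauchy_riemann = {(1, 0): 1, (0, 1): 1j}  # D_1 + i D_2
wave = {(2, 0): 1, (0, 2): -1}            # D_0^2 - D_1^2
heat = {(1, 0): 1, (0, 2): -1}            # D_0 - D_1^2

xi = (0.3, -0.7)
four_pi_sq = 4 * math.pi ** 2
lap_value = symbol(laplacian, xi)
cr_value = symbol(cauchy_riemann, xi)
wave_value = symbol(wave, xi)
heat_full = symbol(heat, xi)
heat_principal = symbol(heat, xi, principal_only=True)
```

For the heat operator the full symbol contains the first-order term $2\pi i\xi_0$, while `heat_principal` keeps only $4\pi^2\xi_1^2$; this is the one case among the four in which the symbol and the principal symbol differ.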


Complex analysis inevitably plays an important role in the study of partial 
differential equations. 

We already saw that complex analysis is useful in addressing the Cauchy 
problem. The Lewy example shows that complex analysis has to play a role in 
drawing a distinction between linear equations with constant coefficients, where 
we always have local existence of solutions, and linear equations with variable 
coefficients, where local existence can fail if the inhomogeneous term of the 
equation is merely $C^\infty$. Actually, the complex analysis that enters the local
existence theorem in Section 3 for linear equations with constant coefficients
is rather primitive and can be absorbed into facts about polynomials in several
variables. Complex analysis enters in a more serious way for more advanced
theorems about partial differential equations, but we shall not pursue theorems
that go in this direction beyond one application in Section 5 of Theorem 7.4.

Linear partial differential equations can exhibit behavior of kinds not seen in
ordinary differential equations.

The operator $L$ on an open set in $\mathbb{R}^N$ is said to be elliptic at $x$ if $P_m(x, 2\pi i\xi) = 0$
for $\xi \in \mathbb{R}^N$ only when $\xi = 0$. The operator $L$ is elliptic if it is elliptic at every
point $x$ of its domain. The Laplacian and the Cauchy–Riemann operator are
elliptic, but the wave operator and the heat operator are not. A linear ordinary
differential operator with nonvanishing coefficient for the highest-order derivative
is automatically elliptic. We shall be especially interested in elliptic operators,
which are relatively easy to handle.

In Section I.2 we considered the Dirichlet problem for the unit disk in $\mathbb{R}^2$,
namely the problem of finding a function $u$ satisfying $\Delta u = 0$ in the interior
and taking prescribed values on the boundary. The problem was solved by the
Poisson integral formula. No matter how rough the function on the boundary
was, the solution $u$ in the interior was a smooth function. Theorem 3.16 extended
this conclusion of smoothness, showing that solutions of $\Delta u = 0$ in any open
set of $\mathbb{R}^N$ are automatically $C^\infty$. This behavior is typical of solutions of linear
elliptic differential equations with smooth coefficients.

Other partial differential equations can behave quite differently. Consider the
wave equation $\bigl(\bigl(\frac{\partial}{\partial t}\bigr)^2 - \Delta_x\bigr)u = 0$ with $x \in \mathbb{R}^N$. We have seen that $u(t, x) =
e^{\omega t - \beta\cdot x}$ is a solution if $\omega$ is a number and $\beta$ is a vector with $\omega^2 = |\beta|^2$. But actually
the exponential function is not important here. If $f$ is any $C^2$ function of one
variable, then $f(\omega t - \beta\cdot x)$ is a solution as long as $\omega^2 = |\beta|^2$ is satisfied: in fact,
$\bigl(\bigl(\frac{\partial}{\partial t}\bigr)^2 - \Delta_x\bigr) f(\omega t - \beta\cdot x) = f''(\omega t - \beta\cdot x)(\omega^2 - |\beta|^2)$. Such a solution represents
an undistorted progressing wave; the roughness of the wave is maintained as time
progresses. Again, this kind of behavior is not exhibited by elliptic equations.
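
The identity for progressing waves can be spot-checked with finite differences. This is a rough numerical sketch under stated assumptions: the profile $f(s) = \sin s + s^2$, the step size, the sample point, and the name `wave_residual` are all choices made here for illustration, and second derivatives are approximated by central differences, so the residual is only approximately zero.

```python
import math

def wave_residual(f, omega, beta, t, x, h=1e-3):
    """Central-difference estimate of (d^2/dt^2 - Laplacian_x) applied to f(omega*t - beta.x)."""
    def u(t_val, x_val):
        return f(omega * t_val - sum(b * c for b, c in zip(beta, x_val)))
    u_tt = (u(t + h, x) - 2 * u(t, x) + u(t - h, x)) / h ** 2
    laplacian = 0.0
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        laplacian += (u(t, x_plus) - 2 * u(t, x) + u(t, x_minus)) / h ** 2
    return u_tt - laplacian

profile = lambda s: math.sin(s) + s ** 2   # assumed sample profile
beta = (3.0, 4.0)                          # |beta| = 5

residual_matched = wave_residual(profile, 5.0, beta, 0.3, (0.1, -0.2))     # omega^2 = |beta|^2
residual_mismatched = wave_residual(profile, 4.0, beta, 0.3, (0.1, -0.2))  # omega^2 != |beta|^2
```

When $\omega^2 = |\beta|^2$ the residual is at the level of the discretization error, while for $\omega = 4$ it is close to $f''(\omega t - \beta\cdot x)(\omega^2 - |\beta|^2)$, which is far from zero.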

In the special case that $L$ is of order 2 with real coefficients and a point $x_0$ is
specified, we can make a linear change of variables in $\xi$ to bring the order-two
terms of the operator into a certain standard form at $x_0$ that makes the question of
ellipticity transparent. This change of variables amounts to replacing the standard
basis $e_1, \dots, e_N$ used for determining the first partial derivatives $D_1, \dots, D_N$ by a
new basis $e_1', \dots, e_N'$ and the corresponding first partial derivatives $D_1', \dots, D_N'$.
The result is as follows.


Proposition 7.5. If $L = P(x, D)$ is of order 2 and has real coefficients in
an open set of $\mathbb{R}^N$ and if a point $x_0$ is specified, then there exists a nonsingular
$N$-by-$N$ real matrix $M = [M_{jk}]$ such that the definition $D_j' = \sum_k M_{jk} D_k$ exhibits
$L$ at $x_0$ as of the form $\kappa_1 (D_1')^2 + \cdots + \kappa_N (D_N')^2$ with each $\kappa_j$ equal to $+1$, $-1$, or $0$.
The principal symbol of $L$ at $x_0$ is then $-4\pi^2 \sum_j \kappa_j (\xi_j')^2$, where $\xi_j' = \sum_k M_{jk}\xi_k$.


REMARKS. We see immediately that $L$ is elliptic at $x_0$ if and only if all $\kappa_j$
are $+1$ or all are $-1$. This is the situation with the Laplacian. In Section 4 we
shall prove a maximum principle for certain elliptic operators of order 2 with
real coefficients, generalizing the corresponding result for the Laplacian given in
Corollary 3.20. If one $\kappa_j$ is $+1$ and the others are $-1$, or if one is $-1$ and the others
are $+1$, the operator is said to be hyperbolic at $x_0$; this is the situation with the
wave operator. Much is known about hyperbolic operators of this kind and about
generalizations of them, but the study of such operators remains a continuing
subject of investigation.


Lemma 7.6 (Principal Axis Theorem). If $B$ is a real symmetric matrix, then
there exist a nonsingular real matrix $M$ and a diagonal matrix $C$ whose diagonal
entries are each $+1$, $-1$, or $0$ such that $B = M^{\mathrm{tr}}CM$.


PROOF. By the finite-dimensional Spectral Theorem for self-adjoint operators,
choose an orthogonal matrix $P$ such that $PBP^{-1}$ is some real diagonal matrix $E$.
Any real number is the product of a square and one of $+1$, $-1$, and $0$, and thus
$E = QCQ$ with $C$ as in the lemma and with $Q = Q^{\mathrm{tr}}$ diagonal and nonsingular.
Since $P$ is orthogonal, $P^{-1} = P^{\mathrm{tr}}$, and therefore $B = P^{\mathrm{tr}}Q^{\mathrm{tr}}CQP$. This proves
the lemma with $M = QP$.
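
The construction in the proof can be followed step by step for a 2-by-2 matrix. The sketch below is an illustration under assumptions made here: the orthogonal matrix $P$ is obtained from a single rotation angle, eigenvalues smaller than a tolerance are treated as zero, and the names `principal_axis_2x2` and `congruence` are hypothetical. The sample matrix corresponds to the order-two operator $D_1D_2$.

```python
import math

def principal_axis_2x2(B, tol=1e-12):
    """For a real symmetric 2x2 matrix B, return (M, C) with B = M^tr diag(C) M and C in {+1, -1, 0}."""
    (a, b), (_, d) = B
    theta = 0.5 * math.atan2(2 * b, a - d)      # rotation angle that diagonalizes B
    c, s = math.cos(theta), math.sin(theta)
    P = [[c, s], [-s, c]]                        # orthogonal; rows are unit eigenvectors of B
    eigenvalues = [a * c * c + 2 * b * c * s + d * s * s,
                   a * s * s - 2 * b * c * s + d * c * c]
    # E = Q C Q with Q diagonal: each eigenvalue is a square times +1, -1, or 0.
    C = [0 if abs(lam) < tol else (1 if lam > 0 else -1) for lam in eigenvalues]
    q = [1.0 if abs(lam) < tol else math.sqrt(abs(lam)) for lam in eigenvalues]
    M = [[q[i] * P[i][j] for j in range(2)] for i in range(2)]   # M = Q P
    return M, C

def congruence(M, C):
    """Return the matrix M^tr diag(C) M."""
    return [[sum(M[i][j] * C[i] * M[i][k] for i in range(2)) for k in range(2)]
            for j in range(2)]

B = [[0.0, 0.5], [0.5, 0.0]]          # matrix of the order-two operator D_1 D_2
M, C = principal_axis_2x2(B)
rebuilt = congruence(M, C)

M_id, C_id = principal_axis_2x2([[1.0, 0.0], [0.0, 1.0]])   # matrix of the Laplacian in R^2
```

For $D_1D_2$ the diagonal entries come out as one $+1$ and one $-1$, and for the identity matrix both entries are $+1$.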


PROOF OF PROPOSITION 7.5. Let the principal symbol be

\[ P_2(x, 2\pi i\xi) = \sum_{|\alpha| = 2} a_\alpha(x)(2\pi i\xi)^\alpha = -4\pi^2 \sum_{|\alpha| = 2} a_\alpha(x)\xi^\alpha. \]

We rewrite this in matrix notation, viewing $\xi = (\xi_1, \dots, \xi_N)$ as a column vector
and converting $\{a_\alpha(x)\}$ into a matrix by defining

\[
\begin{aligned}
b_{jj}(x) &= a_\alpha(x) && \text{if } \alpha \text{ is 2 in the } j\text{th entry and 0 elsewhere},\\
b_{jk}(x) &= \tfrac{1}{2}a_\alpha(x) && \text{if } \alpha \text{ is 1 in the } j\text{th and } k\text{th entries and 0 elsewhere}.
\end{aligned}
\]

Then $B(x) = [b_{jk}(x)]$ is a symmetric matrix, and

\[ P_2(x, 2\pi i\xi) = -4\pi^2 \sum_{j,k} b_{jk}(x)\xi_j\xi_k = -4\pi^2\xi^{\mathrm{tr}}B(x)\xi. \]

We apply the lemma to the real symmetric matrix $B = B(x_0)$ to obtain $B(x_0) =
M^{\mathrm{tr}}C(x_0)M$ with $M$ nonsingular and with $C(x_0)$ diagonal of the form in the
lemma. Define $C(x)$ by $B(x) = M^{\mathrm{tr}}C(x)M$, write $C(x) = [c_{jk}(x)]$ and
$M = [m_{jk}]$, and put $\xi' = M\xi$. Then $P_2(x, 2\pi i\xi) = -4\pi^2\xi^{\mathrm{tr}}B(x)\xi =
-4\pi^2\xi^{\mathrm{tr}}(M^{\mathrm{tr}}C(x)M)\xi = -4\pi^2\xi'^{\,\mathrm{tr}}C(x)\xi'$. If we set $D_j' = \sum_k m_{jk}D_k$, then
the algebraic manipulations for the order-two part of $L$ are the same as with
the principal symbol and show that the order-two part of the operator is given
by $P_2(x, D) = \sum_{j,k} b_{jk}(x)D_jD_k = \sum_{j,k} c_{jk}(x)D_j'D_k'$. The matrix $C(x_0)$ is
diagonal with diagonal entries $+1$, $-1$, and $0$, and the proposition follows.

Ways are needed for making routine the passage via the Fourier transform
between differentiations and multiplications by polynomials.

We are going to be using the Fourier transform to transform any linear equation
$Lu = f$, at least in the constant-coefficient case, into a problem involving division
by a polynomial and inversion of a Fourier transform. It is inconvenient to check 
repeatedly the technical conditions in Proposition 8.1 of Basic that relate differen- 
tiations and multiplications by polynomials. Weak derivatives and Sobolev spaces 
as discussed in Chapter III, and distributions as discussed in Chapter V, all help 
us handle easily the passage via the Fourier transform between differentiations 
and multiplications by polynomials. 


“Fundamental solutions” are useful for obtaining all solutions of a linear 
partial differential equation, especially for constant-coefficient equations. In the 
case of an elliptic equation, a substitute for a fundamental solution that is easier 
to find is a “parametrix,” which at least reveals qualitative properties of solutions. 

In Section I.3 we encountered Green’s functions in connection with Sturm–Liouville
theory. The operator $L$ under study in that section was a second-order
ordinary differential operator, and a Green’s function was the kernel of an integral
operator $T_1$ that we used. To understand symbolically what was happening there,
let us take $r = 1$ in Section I.3, and then the operator $T$, which is the same as the
operator $T_1$ for $r = 1$ in that section, sets up a one-one correspondence between a
class of functions $u$ and a class of functions $f$, the relationship being that $u = Tf$
and $Lu = f$. In other words $T$ was a two-sided inverse of $L$. The operator $T$
was of the form $Tf(x) = \int_a^b G(x, y)f(y)\,dy$. If we think symbolically of taking
$f$ to be a point mass $\delta_{x_0}$ at $x_0$, then we find that $T(\delta_{x_0})(x) = G(x, x_0)$, and the
relationship is to be $L(G(\,\cdot\,, x_0)) = \delta_{x_0}$. In other words the Green’s function at $x_0$
is a fundamental solution $u$ of the equation $Lu = f$ in the sense that application
of $L$ to it yields a point mass at $x_0$.

These matters can easily be made rigorous with distributions of the kind introduced
in Chapter V. In the case that $L$ has constant coefficients, the notion of a
fundamental solution is especially useful because the operator $L$ commutes with
translations. If a certain $u$ produces $Lu = \delta_0$, then translation of that $u$ by some
$x_0$ produces a solution of $Lu = \delta_{x_0}$. In short, one obtains a fundamental solution
for each point by finding it just for one point, and all solutions may be regarded
as the sum of a weighted average of fundamental solutions at the various points
plus a solution of $Lu = 0$. In practice we can carry out this process of weighted
average by means of convolution of distributions. Corollary 5.23 carried out the
details for the Laplacian in $\mathbb{R}^N$, once Theorem 5.22 had identified a fundamental
solution at 0.

In the case of the Laplacian in all of R“ , Theorem 5.22 showed that a funda- 
mental solution at 0 is a multiple of |x|~““— if N > 2. But fundamental solutions 
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are at best inconvenient to obtain for other equations, and a certain amount of 
the qualitative information they yield, at least in the elliptic case, can be obtained 
more easily from a “parametrix,” which is a kind of approximate fundamental 
solution. To illustrate matters, consider the inhomogeneous version Au = f of 
the Laplace equation, which is known as Poisson’s equation. Suppose that / 
is inC oR ) and we seek information about a possible solution u. We shall 
use the Fourier transform, and therefore u had better be a function or distribution 
whose Fourier transform is well defined. But let us leave aside the question of 
what kind of function u is, going ahead with the computation. If we take the 
Fourier transform of both sides, we are led to ask whether the following inverse 
Fourier transform is meaningful: 


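\[ -(4\pi^2)^{-1} \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}\, |\xi|^{-2}\, \hat f(\xi)\, d\xi. \]


Here $\hat f(\xi)$ is in the Schwartz space, but the singularity of $|\xi|^{-2}$ at the origin
does not put $|\xi|^{-2}\hat f(\xi)$ into any evident space of Fourier transforms. To compensate,
we use Proposition 3.5f to introduce a function $\chi \in C^\infty(\mathbb{R}^N)$ that is identically 0
near the origin and is identically 1 away from the origin. Then $\chi(\xi)|\xi|^{-2}\hat f(\xi)$
has no singularity and is in fact in the Schwartz space. It thus makes sense to define

\[ Qf(x) = -(4\pi^2)^{-1} \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}\, \chi(\xi)\, |\xi|^{-2}\, \hat f(\xi)\, d\xi, \]

where $Qf$ is the Schwartz function with

\[ \widehat{Qf}(\xi) = -(4\pi^2)^{-1}\, \chi(\xi)\, |\xi|^{-2}\, \hat f(\xi). \]

Since $\Delta f$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ and $Qf$ is a Schwartz function, $Q\Delta f$ and
$\Delta Qf$ are Schwartz functions. Applying the Fourier transform operator $\mathcal{F}$, as it is
defined on the Schwartz space, and using that the Laplacian acts on the Fourier side as
multiplication by $-4\pi^2|\xi|^2$, we calculate that

\[ \mathcal{F}(Q\Delta f) = \chi \hat f = \mathcal{F}(\Delta Qf). \]

Hence

\[ \mathcal{F}(Q\Delta f - f) = \mathcal{F}(\Delta Qf - f) = (\chi - 1)\hat f. \]

The function $\chi - 1$ on the right side is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, and it is therefore the
Fourier transform of some Schwartz function $K$. Since $\mathcal{F}$ carries convolutions
into products, we have $\hat K \hat f = \mathcal{F}(K * f)$, and consequently

\[ Q\Delta = \Delta Q = 1 + (\text{convolution by } K). \]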
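
The identity above can be checked at the level of Fourier multipliers. The following Python sketch is only an illustration under stated assumptions: the particular cutoff `chi` (equal to 0 for $|\xi| \le 1$ and to 1 for $|\xi| \ge 2$) and all function names are hypothetical choices, not taken from the text. It confirms that the multiplier of $\Delta Q - 1$ is $\chi - 1$, which vanishes outside a compact set.

```python
import math

def smooth_step(t):
    """Smooth function equal to 0 for t <= 0 and to 1 for t >= 1."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))

def chi(r):
    """Hypothetical cutoff in r = |xi|: 0 for r <= 1, 1 for r >= 2."""
    return smooth_step(r - 1.0)

def laplacian_symbol(r):
    """Multiplier of the Laplacian on the Fourier side: -4 pi^2 |xi|^2."""
    return -4.0 * math.pi ** 2 * r ** 2

def q_symbol(r):
    """Multiplier of Q: -(4 pi^2)^{-1} chi |xi|^{-2}, taken as 0 where chi vanishes."""
    c = chi(r)
    return 0.0 if c == 0.0 else -c / (4.0 * math.pi ** 2 * r ** 2)

def error_symbol(r):
    """Multiplier of (Delta Q - 1); it should agree with chi - 1."""
    return laplacian_symbol(r) * q_symbol(r) - 1.0

for r in (0.0, 0.5, 1.5, 3.0):
    print(r, error_symbol(r), chi(r) - 1.0)
```

The printed pairs agree up to rounding, and the error multiplier is 0 once $|\xi| \ge 2$, which is the compact-support property that makes convolution by $K$ a smoothing operator.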


The operator of convolution by $K$ is called a “smoothing operator” because,
as follows from the development of Chapter V, it carries arbitrary distributions of
compact support into smooth functions. The operator $Q$ that gives a two-sided
inverse for $\Delta$ except for the smoothing term is called a parametrix for $\Delta$.

The parametrix does not solve our equation for us, but it does supply useful
information. As we shall see in Section 5, a parametrix will enable us to see that
whenever $u$ is a distribution solution of $\Delta u = f$ on an open set $U$, with $f$ an
arbitrary distribution on $U$, then $u$ is smooth wherever $f$ is smooth. In particular,
any distribution solution of $\Delta u = 0$ is a smooth function. The argument will
apply to any elliptic linear partial differential equation with constant coefficients.
A first application of the method of pseudodifferential operators in Section 6
shows that the same conclusion is valid for any elliptic linear partial differential
equation with smooth variable coefficients.


3. Local Solvability in the Constant-Coefficient Case 


We come to the local existence of solutions to linear partial differential equations 
with constant coefficients. 


Theorem 7.7. Let $U$ be an open set in $\mathbb{R}^N$ containing 0, and let $f$ be in $C^\infty(U)$.
If $P(D)$ is a linear differential operator with constant coefficients and with order
$\ge 1$, then the equation $P(D)u = f$ has a smooth solution in a neighborhood of 0.


The proof will use multiple Fourier series as in Section III.7. Apart from that, 
all that we need will be some manipulations with polynomials in several variables 
and an integration. As in Section III.7, let us write $\mathbb{Z}^N$ for the set of all integer
$N$-tuples and $[-\pi, \pi]^N$ for the region of integration defining the Fourier series.

We shall give the idea of the proof, state a lemma, prove the theorem from
the lemma, and then return to the proof of the lemma. The idea of the proof of
Theorem 7.7 is as follows: We begin by multiplying $f$ by a smooth function
that is identically 1 near the origin and is identically 0 off some small ball
containing the origin (existence of the smooth function by Proposition 3.5f),
so that $f$ is smooth of compact support, the support lying well inside $[-\pi, \pi]^N$.
If we regard $f$ as extended periodically to a smooth function, we can write
$f(x) = \sum_{k \in \mathbb{Z}^N} d_k e^{ik\cdot x}$ by Proposition 3.30e. Let the unknown function $u$ be
given by $u(x) = \sum_{k \in \mathbb{Z}^N} c_k e^{ik\cdot x}$. Then $P(D)u(x)$ is given by

\[ P(D)u(x) = \sum_{k \in \mathbb{Z}^N} c_k P(ik)\, e^{ik\cdot x}, \]

and thus we want to take $c_k P(ik) = d_k$. We are done if $d_k / P(ik)$ decreases faster
than any $|k|^{-n}$, by Proposition 3.30c and our computations. So we would like to
prove that

\[ |P(ik)|^{-1} \le C(1 + |k|^2)^M \qquad \text{for all } k \in \mathbb{Z}^N \]

and for some constants $C$ and $M$, and then we would be done. Unfortunately this
is not necessarily true; the polynomial $P(x) = |x|^2$ is a counterexample. What is
true is the statement in the following lemma, and we can readily adjust the above
idea to prove the theorem from this lemma.


Lemma 7.8. If $R(x)$ is any complex-valued polynomial not identically 0 on
$\mathbb{R}^N$, then there exist $\alpha \in \mathbb{R}^N$ and constants $C$ and $M$ such that

\[ |R(k + \alpha)|^{-1} \le C(1 + |k|^2)^M \qquad \text{for all } k \in \mathbb{Z}^N. \]
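
A one-variable example may help show why the shift $\alpha$ matters. The Python sketch below uses a made-up operator $P(D) = d^2/dx^2 + 1$, so that $P(x) = x^2 + 1$ and $R(x) = P(ix) = 1 - x^2$; the operator, the shift $\alpha = 1/2$, and the sampled range of $k$ are assumptions chosen for illustration.

```python
def P(z):
    """Hypothetical polynomial P(x) = x^2 + 1, i.e. P(D) = d^2/dx^2 + 1."""
    return z * z + 1

def R(x):
    """R(x) = P(ix), the polynomial to which Lemma 7.8 is applied."""
    return P(1j * x)

ks = range(-50, 51)

# Without a shift, R vanishes at k = -1 and k = 1, so 1/|R(k)| is undefined there.
unshifted_zeros = [k for k in ks if R(k) == 0]

# With the shift alpha = 1/2, |R(k + alpha)| = |1 - (k + 1/2)^2| stays away from 0.
alpha = 0.5
smallest = min(abs(R(k + alpha)) for k in ks)

print(unshifted_zeros, smallest)
```

On the sampled range the shifted values satisfy $|R(k+\alpha)|^{-1} \le 4/3$, so the estimate of the lemma holds there with $C = 4/3$ and $M = 0$, whereas no estimate of that form is possible at $k = \pm 1$ without the shift.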


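PROOF OF THEOREM 7.7. Apply the lemma to $R(x) = P(ix)$. Because of the
preliminary step of multiplying $f$ by something, we are assuming that $f$ is smooth
and has support near 0. Instead of extending $f$ to be periodic, as suggested in
the discussion before the lemma, we extend the function $f(x)e^{-i\alpha\cdot x}$ to be smooth
and periodic. Thus write

\[ f(x)e^{-i\alpha\cdot x} = \sum_{k \in \mathbb{Z}^N} d_k e^{ik\cdot x}, \]

and put $c_k = d_k / R(k + \alpha)$. Since the $|d_k|$ decrease faster than $|k|^{-n}$ for any $n$,
Lemma 7.8 and Proposition 3.30c together show that $\sum_{k \in \mathbb{Z}^N} c_k e^{ik\cdot x}$ is smooth
and periodic. Define

\[ u(x) = e^{i\alpha\cdot x} \sum_{k \in \mathbb{Z}^N} c_k e^{ik\cdot x} = \sum_{k \in \mathbb{Z}^N} c_k e^{i(k+\alpha)\cdot x}. \]

This function is smooth but maybe is not periodic. Application of $P(D)$ gives

\[ \begin{aligned}
P(D)u(x) &= \sum_{k \in \mathbb{Z}^N} c_k P(i(k + \alpha))\, e^{i(k+\alpha)\cdot x} \\
&= e^{i\alpha\cdot x} \sum_{k \in \mathbb{Z}^N} \frac{d_k}{R(k + \alpha)}\, P(i(k + \alpha))\, e^{ik\cdot x} \\
&= e^{i\alpha\cdot x} \sum_{k \in \mathbb{Z}^N} d_k e^{ik\cdot x} = e^{i\alpha\cdot x}\big(f(x)e^{-i\alpha\cdot x}\big) = f(x),
\end{aligned} \]

and hence $u$ solves the equation for the original $f$ in a neighborhood of the origin.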
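
The construction in this proof can be tried out numerically in one variable. The following Python sketch is a hedged illustration, not the general argument: it assumes the made-up operator $P(D) = d^2/dx^2 + 1$, the shift $\alpha = 1/2$, and a finite list of invented coefficients $d_k$, and it checks by a finite-difference approximation that the function $u$ built from $c_k = d_k / R(k + \alpha)$ satisfies $P(D)u = f$.

```python
import cmath

ALPHA = 0.5                                     # shift; keeps R(k + ALPHA) nonzero here
D_COEFFS = {-1: 0.5, 0: 1.0, 1: 0.25, 2: -0.3}  # made-up coefficients d_k

def R(x):
    """R(x) = P(ix) = 1 - x^2 for the hypothetical operator P(D) = d^2/dx^2 + 1."""
    return 1.0 - x * x

def f(x):
    """Right side, defined so that f(x) e^{-i ALPHA x} = sum_k d_k e^{ikx}."""
    series = sum(d * cmath.exp(1j * k * x) for k, d in D_COEFFS.items())
    return cmath.exp(1j * ALPHA * x) * series

def u(x):
    """u(x) = sum_k c_k e^{i(k + ALPHA)x} with c_k = d_k / R(k + ALPHA)."""
    return sum(d / R(k + ALPHA) * cmath.exp(1j * (k + ALPHA) * x)
               for k, d in D_COEFFS.items())

def apply_P(func, x, h=1e-3):
    """Approximate (d^2/dx^2 + 1) func at x with a central second difference."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / (h * h) + func(x)

residual = max(abs(apply_P(u, x) - f(x)) for x in (-1.0, -0.3, 0.0, 0.4, 1.2))
print(residual)
```

The residual is at the level of the finite-difference error, consistent with the term-by-term computation above.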
The proof of Lemma 7.8 requires two lemmas of its own. 


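Lemma 7.9. For each positive integer $m$ and positive number $\delta < 1/m$, there
exists a constant $C$ such that

\[ \int_{-1}^{1} |x - c_1|^{-\delta} \cdots |x - c_m|^{-\delta}\, dx \le C \]

for any $m$ complex numbers $c_1, \dots, c_m$.

PROOF. For $1 \le j \le m$, let $E_j$ be the subset of $[-1, 1]$ where $|x - c_j|^{-\delta}$ is the
largest of the $m$ factors in the integrand. The integral in question is then

\[ \begin{aligned}
&= \sum_{j=1}^{m} \int_{E_j} |x - c_1|^{-\delta} \cdots |x - c_m|^{-\delta}\, dx \\
&\le \sum_{j=1}^{m} \int_{E_j} |x - c_j|^{-m\delta}\, dx \le \sum_{j=1}^{m} \int_{-1}^{1} |x - c_j|^{-m\delta}\, dx \\
&\le \sum_{j=1}^{m} \int_{-1}^{1} |x - \operatorname{Re} c_j|^{-m\delta}\, dx \le m \sup_{r \in \mathbb{R}} \int_{-1}^{1} |x - r|^{-m\delta}\, dx.
\end{aligned} \]

On the right side the integrand decreases pointwise with $|r|$ when $|r| \ge 1$, and
hence the expression is equal to

\[ \begin{aligned}
m \sup_{-1 \le r \le 1} \int_{-1}^{1} |x - r|^{-m\delta}\, dx
&= m \sup_{-1 \le r \le 1} \Big( \int_{-1}^{r} (r - x)^{-m\delta}\, dx + \int_{r}^{1} (x - r)^{-m\delta}\, dx \Big) \\
&= m(1 - m\delta)^{-1} \sup_{-1 \le r \le 1} \big( (1 + r)^{1 - m\delta} + (1 - r)^{1 - m\delta} \big) \\
&\le 2m(1 - m\delta)^{-1}.
\end{aligned} \]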
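
The last step of this proof can be checked directly from the closed form of the one-variable integral. The Python sketch below is only a numerical illustration; the sample values $m = 3$ and $\delta = 0.2$ and the grid of values of $r$ are arbitrary assumptions.

```python
def integral(r, p):
    """Closed form of the integral of |x - r|^(-p) over [-1, 1], for -1 <= r <= 1, 0 < p < 1."""
    return ((1 + r) ** (1 - p) + (1 - r) ** (1 - p)) / (1 - p)

m, delta = 3, 0.2             # sample values with m * delta < 1
p = m * delta
grid = [-1 + 2 * i / 1000 for i in range(1001)]
sup_value = max(integral(r, p) for r in grid)
bound = 2 * m / (1 - p)       # the constant 2m / (1 - m delta) from the proof

print(sup_value, m * sup_value, bound)
```

The supremum over the grid is attained at $r = 0$, where the two terms are equal, and $m$ times that value matches the constant $2m(1 - m\delta)^{-1}$.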


Lemma 7.10. If $R(x)$ is any complex-valued polynomial on $\mathbb{R}^N$ of degree
$m > 0$, then $|R(x)|^{-\delta}$ is locally integrable whenever $\delta < 1/m$.


PROOF. We first treat the special case that $x_1^m$ has coefficient 1 in $R(x)$ and that
integrability on the cube $[-1, 1]^N$ is to be checked. Write $x'$ for $(x_2, \dots, x_N)$,
so that $x = (x_1, x')$. Then $R(x) = x_1^m + \sum_{j=0}^{m-1} x_1^j p_j(x')$, where each $p_j$ is a
polynomial. For fixed $x'$, $R(x_1, x')$ is a monic polynomial of degree $m$ in $x_1$ and
factors as $(x_1 - c_1) \cdots (x_1 - c_m)$ for some complex numbers $c_1, \dots, c_m$ depending
on $x'$. Applying Lemma 7.9, we see that $\int_{-1}^{1} |R(x_1, x')|^{-\delta}\, dx_1 \le C$. Integration
in the remaining $N - 1$ variables therefore gives $\int_{[-1,1]^N} |R(x)|^{-\delta}\, dx \le 2^{N-1} C$.

Turning to the general case, suppose that $R(x)$ and a point $x_0$ are given. We
want to see that $F(x) = R(x + x_0)$ has the property that $|F(x)|^{-\delta}$ is integrable
on some neighborhood of the origin in $\mathbb{R}^N$. The function $F$ is still a polynomial
of degree $m$. Let $F_m$ be the sum of all its terms of total degree $m$. This cannot be
identically 0 on the unit sphere since it is a nonzero homogeneous function,⁴ and
thus $F_m(v_1) \ne 0$ for some unit vector $v_1$. Extend $\{v_1\}$ to an orthonormal basis
$\{v_1, \dots, v_N\}$ of $\mathbb{R}^N$, and define $G(y_1, \dots, y_N) = F(y_1 v_1 + \cdots + y_N v_N)$. The
function $G$ is a polynomial of degree $m$ whose coefficient of $y_1^m$ is $F_m(v_1)$ and hence is
not 0, and it is obtained by applying an orthogonal transformation to the variables of
$F$. Therefore $|G|^{-\delta}$ and $|F|^{-\delta}$ have the same integral over a ball centered at the
origin. The special case shows that $|G|^{-\delta}$ is integrable over some such ball, and
hence so is $|F|^{-\delta}$.


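⁴A function $F_m$ of several variables is homogeneous of degree $m$ if $F_m(rx) = r^m F_m(x)$ for all
$r > 0$ and all $x \ne 0$.

PROOF OF LEMMA 7.8. Let $R$ have degree $m$, which we may assume is positive
without loss of generality. The function $S(x) = |x|^{2m} R(x/|x|^2)$ is then a polynomial
of degree $\le 2m$, and Lemma 7.10 shows that any number $\delta$ with $\delta < \frac{1}{2m}$ has
the property that $|R|^{-\delta}$ and $|S|^{-\delta}$ are integrable for $|x| \le 1$. Using spherical
coordinates and making the change of variables $r \mapsto 1/r$ in the radial direction,
we see that

\[ \begin{aligned}
\int_{|x| \ge 1} |R(x)|^{-\delta} |x|^{-2N}\, dx
&= \int_{r=1}^{\infty} \int_{\omega \in S^{N-1}} |R(r\omega)|^{-\delta}\, r^{-2N}\, d\omega\, r^{N-1}\, dr \\
&= \int_{r=0}^{1} \int_{\omega \in S^{N-1}} |R(r^{-1}\omega)|^{-\delta}\, d\omega\, r^{N-1}\, dr \\
&= \int_{|x| \le 1} |R(x/|x|^2)|^{-\delta}\, dx \\
&= \int_{|x| \le 1} |S(x)|^{-\delta}\, |x|^{2m\delta}\, dx \\
&\le \int_{|x| \le 1} |S(x)|^{-\delta}\, dx.
\end{aligned} \]

The right side is finite. Since $(1 + |x|^2)^{-N} \le |x|^{-2N}$ and since $|R|^{-\delta}$ is
integrable for $|x| \le 1$, we see that

\[ \int_{\mathbb{R}^N} |R(x)|^{-\delta} (1 + |x|^2)^{-N}\, dx < \infty. \]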


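Define $E = \{\alpha \in \mathbb{R}^N \mid 0 \le \alpha_j < 1 \text{ for all } j\}$. By complete additivity, we can
rewrite the above finiteness condition as

\[ \int_{\alpha \in E} \Big[ \sum_{k \in \mathbb{Z}^N} |R(k + \alpha)|^{-\delta} (1 + |k + \alpha|^2)^{-N} \Big]\, d\alpha < \infty. \]

Every pair $(l, \beta)$ with $l \in \mathbb{Z}$ and $\beta \in [0, 1)$ has $(l + \beta)^2 \le 2(1 + l^2)$. Summing $N$
such inequalities gives $|k + \alpha|^2 \le 2N + 2|k|^2 \le 2N(1 + |k|^2)$. Thus we obtain
$1 + |k + \alpha|^2 \le 3N(1 + |k|^2)$, $(1 + |k + \alpha|^2)^{-N} \ge (3N)^{-N}(1 + |k|^2)^{-N}$, and

\[ \int_{\alpha \in E} \Big[ \sum_{k \in \mathbb{Z}^N} |R(k + \alpha)|^{-\delta} (1 + |k|^2)^{-N} \Big]\, d\alpha < \infty. \]

Therefore $\sum_{k \in \mathbb{Z}^N} |R(k + \alpha)|^{-\delta} (1 + |k|^2)^{-N}$ is finite almost everywhere $[d\alpha]$. Fix
an $\alpha$ for which the sum is finite. If

\[ \sum_{k \in \mathbb{Z}^N} |R(k + \alpha)|^{-\delta} (1 + |k|^2)^{-N} = K < \infty, \]

then $|R(k + \alpha)|^{-\delta} (1 + |k|^2)^{-N} \le K$ for all $k \in \mathbb{Z}^N$ and hence $|R(k + \alpha)|^{-1} \le
K^{1/\delta} (1 + |k|^2)^{N/\delta}$ for all $k \in \mathbb{Z}^N$. This proves Lemma 7.8.


4. Maximum Principle in the Elliptic Second-Order Case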


In this section we work with a second-order linear homogeneous elliptic equation
$Lu = 0$ with continuous real-valued coefficients in a bounded connected open
subset $U$ of $\mathbb{R}^N$. It will be assumed that only derivatives of $u$, and not $u$ itself,
appear in the equation; in other words we assume that $L(1) = 0$. The conclusion
will be that a real-valued $C^2$ solution $u$ cannot have an absolute maximum or
an absolute minimum inside $U$ without being constant. This result was proved
already in Corollary 3.20 for the special case that $L$ is the Laplacian $\Delta$.

Let us use notation for $L$ of the kind in Proposition 7.5 and its proof. Then $L$
is of the form

\[ Lu = \sum_{i,j} b_{ij}(x) D_i D_j u + \sum_{k} c_k(x) D_k u \]

with the matrix $[b_{ij}(x)]$ real-valued and symmetric. Ellipticity of $L$ at $x$ means
that $\sum_{i,j} b_{ij}(x)\xi_i\xi_j \ne 0$ for $\xi \ne 0$. Thus $\big|\sum_{i,j} b_{ij}(x)\xi_i\xi_j\big|$ has a positive
minimum value $\mu(x)$ on the compact set where $|\xi| = 1$. By homogeneity of
$\big|\sum_{i,j} b_{ij}(x)\xi_i\xi_j\big|$ and $|\xi|^2$, we conclude that

\[ \Big| \sum_{i,j} b_{ij}(x)\xi_i\xi_j \Big| \ge \mu(x)|\xi|^2 \]

for some $\mu(x) > 0$ and all $\xi$. The positive number $\mu(x)$ is called the modulus
of ellipticity of $L$ at $x$.
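
For a concrete feel for the modulus of ellipticity, the following Python sketch computes $\mu(x)$ at a single point for a made-up symmetric $2 \times 2$ coefficient matrix, once by sampling the unit circle and once from the eigenvalues. The matrix and the helper names are hypothetical, and the eigenvalue shortcut assumes that the matrix is definite, as ellipticity forces for real symmetric matrices when $N \ge 2$.

```python
import math

def quadratic_form(b, xi):
    """Value of sum_{i,j} b[i][j] xi_i xi_j for a 2-by-2 matrix b."""
    return sum(b[i][j] * xi[i] * xi[j] for i in range(2) for j in range(2))

def modulus_by_sampling(b, samples=3600):
    """Approximate the minimum of |quadratic form| over |xi| = 1 by sampling the circle."""
    angles = (2 * math.pi * n / samples for n in range(samples))
    return min(abs(quadratic_form(b, (math.cos(t), math.sin(t)))) for t in angles)

def modulus_by_eigenvalues(b):
    """Smallest |eigenvalue| of a symmetric 2-by-2 matrix (the minimum when b is definite)."""
    mean = (b[0][0] + b[1][1]) / 2
    radius = math.hypot((b[0][0] - b[1][1]) / 2, b[0][1])
    return min(abs(mean - radius), abs(mean + radius))

b = [[2.0, 1.0], [1.0, 2.0]]   # made-up coefficient matrix [b_ij(x)] at one point x
mu_sampled = modulus_by_sampling(b)
mu_exact = modulus_by_eigenvalues(b)
print(mu_sampled, mu_exact)
```

Here the quadratic form on the unit circle is $2 + \sin 2\theta$, so both computations give $\mu(x) = 1$.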


EXAMPLE. Let $L$ be the sum of the Laplacian and first-order terms, i.e.,
$Lu = \Delta u + \sum_k c_k(x) D_k u$. Suppose that $u$ is a real-valued $C^2$ function on $U$
and that $u$ attains a local maximum at $x_0$ in $U$. By calculus, $D_i u(x_0) = 0$ for
each $i$ and $D_i^2 u(x_0) \le 0$, so that $Lu(x_0) \le 0$. Therefore if we know that $Lu(x)$
is $> 0$ everywhere in $U$, then $u$ can have no local maximum in $U$. To obtain
a maximum principle, we want to relax two conditions and still get the same
conclusion. One is that we want to allow more general second-order terms in $L$,
and the other is that we want to get a conclusion from knowing only that $Lu(x)$
is $\ge 0$ everywhere. The first step is carried out in Lemma 7.11 below, and the
second step will be derived from the first essentially by perturbing the situation
in a subtle way.


Lemma 7.11. Let $L = \sum_{i,j} b_{ij}(x) D_i D_j + \sum_k c_k(x) D_k$, with $[b_{ij}(x)]$
symmetric, be a second-order linear elliptic operator with real-valued coefficients in
an open subset $U$ of $\mathbb{R}^N$ such that for every $x$ in $U$, there is a number $\mu(x) > 0$
such that $\sum_{i,j} b_{ij}(x)\xi_i\xi_j \ge \mu(x)|\xi|^2$ for all $\xi \in \mathbb{R}^N$. If $u$ is a real-valued $C^2$
function on $U$ such that $Lu > 0$ everywhere in $U$, then $u$ has no local maximum
in $U$.

PROOF. Suppose that $u$ has a local maximum at $x_0$. Applying Proposition 7.5,
we can find a nonsingular matrix $M$ such that the definition $D_i' = \sum_j M_{ij} D_j$
makes the second-order terms of $L$ at $x_0$ take the form $\kappa_1 D_1'^2 + \cdots + \kappa_N D_N'^2$
with each $\kappa_j$ equal to $+1$, $-1$, or 0. Examining the hypotheses of the lemma, we
see that all $\kappa_j$ must be $+1$. Hence the change of basis at $x_0$ via $M$ converts the
second-order terms of $L$ into the form $D_1'^2 + \cdots + D_N'^2$. The argument in the
example above is applicable at $x_0$, and the lemma follows.


Theorem 7.12 (Hopf maximum principle). Let

\[ L = \sum_{i,j} b_{ij}(x) D_i D_j + \sum_{k} c_k(x) D_k, \]

with $[b_{ij}(x)]$ symmetric, be a second-order linear elliptic operator with
real-valued continuous coefficients in a connected open subset $U$ of $\mathbb{R}^N$. If $u$ is a
real-valued $C^2$ function on $U$ such that $Lu = 0$ everywhere in $U$, then $u$ cannot
attain its maximum or minimum values in $U$ without being constant.


PROOF. First we normalize matters suitably. We have $\big|\sum_{i,j} b_{ij}(x)\xi_i\xi_j\big| \ge
\mu(x)|\xi|^2$ with $\mu(x) > 0$ at every point. By continuity of the coefficients and
connectedness of $U$, the expression within the absolute value signs on the left
side is everywhere positive or everywhere negative. Possibly replacing $L$ by $-L$,
we shall assume that it is everywhere positive:

\[ \sum_{i,j} b_{ij}(x)\xi_i\xi_j \ge \mu(x)|\xi|^2 \qquad \text{for all } x \in U. \]

Because of the continuity of the coefficients of $L$, the coefficient functions are
bounded on any compact subset of $U$ and the function $\mu(x)$ is bounded below by a
positive constant on any such compact set. Since $u$ can always be replaced by $-u$,
a result about absolute maxima is equivalent to a result about absolute minima.
Thus we may suppose that $u$ attains its absolute maximum value $M$ at some $x_1$ in
$U$ and we are to prove that $u$ is constant in $U$. Arguing by contradiction, suppose
that $x_0$ is a point in $U$ with $u(x_0) < M$.

The idea of the proof is to use $x_0$ and $x_1$ to produce an open ball $B$ with $B^{\mathrm{cl}} \subseteq U$
and a point $s$ in the boundary $\partial B$ of $B$ such that $u(s) = M$ and $u(x) < M$ for all
$x$ in $B^{\mathrm{cl}} - \{s\}$. See Figure 7.1. For a suitably small open ball $B_1$ centered at $s$,
we then produce a $C^2$ function $w$ on $\mathbb{R}^N$ such that $Lw > 0$ in $B_1$ and $w$ attains
a local maximum at the center $s$ of $B_1$. The existence of $w$ contradicts Lemma
7.11, and thus the configuration with $x_0$ and $x_1$ could not have occurred.

FIGURE 7.1. Construction in the proof of the Hopf maximum principle.


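Since $U$ is a connected open set in $\mathbb{R}^N$, it is pathwise connected. Let
$p : [0, 1] \to U$ be a path with $p(0) = x_0$ and $p(1) = x_1$. Let $\tau$ be the first
value of $t$ such that $u(p(t)) = M$; necessarily $0 < \tau \le 1$. Define $x_2 = p(\tau)$.
Choose $d > 0$ such that $B(d; p(t))^{\mathrm{cl}} \subseteq U$ for $0 \le t \le \tau$, and then fix a
point $\bar{x} = p(\bar{t})$ with $0 \le \bar{t} < \tau$ and with $|\bar{x} - x_2| < \tfrac12 d$. By definition of $d$,
$B(d; \bar{x})^{\mathrm{cl}} \subseteq U$. Let $\widetilde{B}$ be the largest open ball contained in $U$, centered at $\bar{x}$, and
having $u(x) < M$ for $x \in \widetilde{B}$. Since $u(x_2) = M$ and $|\bar{x} - x_2| < \tfrac12 d$, $\widetilde{B}$ has radius
$< \tfrac12 d$. Thus $\widetilde{B}^{\mathrm{cl}} \subseteq B(d; \bar{x})^{\mathrm{cl}} \subseteq U$. The construction of $\widetilde{B}$ and the continuity of
$u$ force some point $s$ of the boundary $\partial\widetilde{B}$ to have $u(s) = M$. Let $B$ be any open
ball properly contained in $\widetilde{B}$ and internally tangent to $\widetilde{B}$ at $s$. Then $B^{\mathrm{cl}} \subseteq \widetilde{B} \cup \{s\}$,
and hence $u(x) < M$ everywhere on $B^{\mathrm{cl}}$ except at $s$, where $u(s) = M$. Write
$B = B(R; x')$.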

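To construct $B_1$, fix $R_1 > 0$ with $R_1 < \tfrac12 R$, and let $B_1 = B(R_1; s)$. If $x$ is
in $B_1^{\mathrm{cl}}$, then $|x - \bar{x}| \le |x - s| + |s - \bar{x}| \le R_1 + \tfrac12 d < \tfrac12 R + \tfrac12 d < d$, and
hence $B_1^{\mathrm{cl}} \subseteq B(d; \bar{x})^{\mathrm{cl}} \subseteq U$. Since $B^{\mathrm{cl}}$ and $B_1^{\mathrm{cl}}$ are compact subsets of $U$, the
coefficients of $L$ are bounded on $B^{\mathrm{cl}} \cup B_1^{\mathrm{cl}}$, and the ellipticity modulus is bounded
below by a positive number. Let us say that

\[ |b_{ij}(x)| \le \beta, \qquad |c_k(x)| \le \gamma, \qquad \mu(x) \ge \mu > 0 \qquad \text{for } x \in B^{\mathrm{cl}} \cup B_1^{\mathrm{cl}}. \]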


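The next step is to construct an auxiliary function $z(x)$ on $\mathbb{R}^N$ to be used in
the definition of $w(x)$. Let $\alpha$ be a (large) positive number to be specified, and set

\[ z(x) = e^{-\alpha|x - x'|^2} - e^{-\alpha R^2}. \]

The function $z(x)$ is $> 0$ on $B$, is 0 on $\partial B$, and is $< 0$ off $B^{\mathrm{cl}}$. Let us see that
we can choose $\alpha$ large enough to make $L(z)(x) > 0$ for $x$ in $B_1$. Performing the
differentiations explicitly, we obtain

\[ \begin{aligned}
L(z)(x) &= 2\alpha e^{-\alpha|x - x'|^2} \Big( 2\alpha \sum_{i,j} b_{ij}(x)(x_i - x_i')(x_j - x_j')
 - \sum_{k} \big( b_{kk}(x) + c_k(x)(x_k - x_k') \big) \Big) \\
&\ge 2\alpha e^{-\alpha|x - x'|^2} \big( 2\alpha\mu|x - x'|^2 - N(\beta + \gamma|x - x'|) \big),
\end{aligned} \]

where the factor $N$ accounts for the $N$ terms of the sum over $k$, each of which is at
most $\beta + \gamma|x - x'|$ in absolute value. All points $x$ in $B_1$ have $\tfrac12 R \le |x - x'| \le \tfrac32 R$
and therefore satisfy

\[ L(z)(x) \ge 2\alpha e^{-\alpha|x - x'|^2} \big( \tfrac12 \alpha\mu R^2 - N(\beta + \tfrac32 \gamma R) \big). \]

Consequently we can choose $\alpha$ large enough so that $L(z)(x) > 0$ for $x$ in $B_1$. Fix
$\alpha$ with this property.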
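
The choice of $\alpha$ can be made explicit in the simplest case. The following Python sketch is an illustration only: it assumes that $L$ is the Laplacian in $\mathbb{R}^2$ (so $b_{ij} = \delta_{ij}$ and $c_k = 0$), that $x' = 0$ and $R = 1$, and that $\alpha = 5$; none of these values come from the text. In this case the formula above reduces to $\Delta z = 2\alpha e^{-\alpha|x|^2}(2\alpha|x|^2 - N)$, which is positive for $|x| \ge \tfrac12 R$ as soon as $\alpha > 2N/R^2$.

```python
import math

N = 2          # dimension
R = 1.0        # radius of the ball B = B(R; x'), with x' taken to be the origin
ALPHA = 5.0    # any ALPHA > 2N / R^2 = 4 works for the Laplacian in this sketch

def z(x, y):
    """Auxiliary function z = exp(-ALPHA |x|^2) - exp(-ALPHA R^2), centered at 0."""
    return math.exp(-ALPHA * (x * x + y * y)) - math.exp(-ALPHA * R * R)

def laplacian_z_exact(x, y):
    """Closed form: Delta z = 2 ALPHA exp(-ALPHA r^2) (2 ALPHA r^2 - N)."""
    r2 = x * x + y * y
    return 2 * ALPHA * math.exp(-ALPHA * r2) * (2 * ALPHA * r2 - N)

def laplacian_z_numeric(x, y, h=1e-3):
    """Five-point finite-difference approximation of Delta z."""
    return (z(x + h, y) + z(x - h, y) + z(x, y + h) + z(x, y - h) - 4 * z(x, y)) / (h * h)

radii = [0.5 + 0.05 * n for n in range(21)]    # samples of R/2 <= r <= 3R/2
values = [laplacian_z_exact(r, 0.0) for r in radii]
print(min(values), laplacian_z_numeric(0.5, 0.0), laplacian_z_exact(0.5, 0.0))
```

All sampled values of $\Delta z$ are positive on the range $\tfrac12 R \le |x| \le \tfrac32 R$, and the finite-difference value agrees with the closed form.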

Let $\epsilon > 0$ be a (small) positive number to be specified, and define

\[ w = u + \epsilon z. \]

For $x$ in $B_1$, we have $Lw = Lu + \epsilon Lz > 0$. Also,

\[ w(s) = u(s) + \epsilon z(s) = u(s) = M \qquad \text{since } s \text{ is in } \partial B. \]

Let us see that we can choose $\epsilon$ to make $w(x) < M$ everywhere on $\partial B_1$. We
consider $\partial B_1$ in two pieces. Let $C_0 = \partial B_1 \cap B^{\mathrm{cl}}$. Since $C_0$ is a subset of $B^{\mathrm{cl}} - \{s\}$,
$u(x) < M$ at every point of $C_0$. By compactness of $C_0$ and continuity of $u$, we
must therefore have $u(x) \le M - \delta$ on $C_0$ for some $\delta > 0$. Since the function
$z(x)$ is everywhere $\le 1 - e^{-\alpha R^2}$, any $x$ in $C_0$ must have

\[ w(x) = u(x) + \epsilon z(x) \le M - \delta + \epsilon(1 - e^{-\alpha R^2}). \]

By taking $\epsilon$ small enough, we can arrange that $w(x) \le M - \tfrac12\delta$ on $C_0$. Fix such
an $\epsilon$. The remaining part of $\partial B_1$ is $\partial B_1 - C_0$. Each $x$ in this set has

\[ w(x) = u(x) + \epsilon z(x) \le M + \epsilon z(x) < M. \]

Thus $w(x) < M$ everywhere on $\partial B_1$, as asserted.

Since $w(s) = M$ and $w(x) < M$ everywhere on $\partial B_1$, $w$ attains its maximum in
$B_1^{\mathrm{cl}}$ somewhere in the open set $B_1$. Since $Lw > 0$ on $B_1$, we obtain a contradiction
to Lemma 7.11. This completes the proof.


5. Parametrices for Elliptic Equations with Constant Coefficients


In this section we use distribution theory to derive some results about an elliptic
equation $P(D)u = f$ with constant coefficients. Initially we work on $\mathbb{R}^N$, yet
in the end we will be able to work on any nonempty open set. We think of $f$ as
known and $u$ as unknown. But we allow $f$ to vary, so that we can see the effect
on $u$ of changing $f$. It will be important to be able to allow solutions that are not
smooth functions, and thus $u$ will be allowed to be some kind of distribution.

We begin by obtaining a parametrix, which at first will be a tempered
distribution that approximately inverts $P(D)$ on $\mathcal{S}'(\mathbb{R}^N)$. More specifically it inverts
$P(D)$ on $\mathcal{S}'(\mathbb{R}^N)$ up to an error term given by an operator equal to convolution
with a Schwartz function.

At this point we can use the version Theorem 7.4 of the Cauchy–Kovalevskaya
Theorem to obtain a fundamental solution, i.e., a member $u$ of $\mathcal{D}'(\mathbb{R}^N)$ such
that $P(D)u = \delta$. This step is carried out in Corollary 7.15 below. Convolution
of $P(D)u = \delta$ with a member $f$ of $\mathcal{E}'(\mathbb{R}^N)$ shows that Corollary 7.15 implies a
global existence theorem: any elliptic equation $P(D)u = f$ with $f$ in $\mathcal{E}'(\mathbb{R}^N)$
has a solution in $\mathcal{D}'(\mathbb{R}^N)$.

But it is not necessary, for purposes of examining regularity of solutions, to
have an existence theorem. The next step is to modify the parametrix to have
compact support. Once that has been done, the parametrix will invert $P(D)$
on $\mathcal{D}'(\mathbb{R}^N)$, up to a smoothing term, and we will deduce a regularity theorem
about solutions saying that the singular support of $u$ is contained in the singular
support of $f$. In particular, solutions of $P(D)u = 0$ on $\mathbb{R}^N$ are smooth. Finally
we localize this result to see that the inclusion of singular supports persists even
when the equation $P(D)u = f$ is being considered only on an open set $U$.

The starting point for our investigation is the following lemma. 


Lemma 7.13. If $P(D)$ is an elliptic operator with constant coefficients, then
the set of zeros of $P(2\pi i\xi)$ in $\mathbb{R}^N$ is compact.


REMARK. The polynomial $P(2\pi i\xi)$ is the symbol of $P(D)$, as defined in
Section 2. The important fact about the symbol is that the Fourier transform
satisfies $\mathcal{F}(P(D)T) = P(2\pi i\xi)\mathcal{F}(T)$, which follows immediately from the
formula $\mathcal{F}(D^\alpha T) = (2\pi i)^{|\alpha|}\xi^\alpha \mathcal{F}(T)$. This fact accounts for our studying the
particular polynomial $P(2\pi i\xi)$.


PROOF. Let $P$ have order $m$, and let $Z$ be the set of zeros of $P(2\pi i\xi)$ in $\mathbb{R}^N$.
Since $P(D)$ is elliptic, the principal symbol $P_m(2\pi i\xi)$ is nowhere 0 on the unit
sphere of $\mathbb{R}^N$. By compactness of the sphere, $|P_m(2\pi i\xi)| \ge c > 0$ there, for some
constant $c$. Taking into account the homogeneity of $P_m$, we see that $|P_m(2\pi i\xi)| \ge
c|\xi|^m$ for all $\xi$ in $\mathbb{R}^N$. If we write $P(2\pi i\xi) = P_m(2\pi i\xi) + Q(2\pi i\xi)$, then
$|Q(2\pi i\xi)| \le C|\xi|^{m-1}$ for $|\xi| \ge 1$ and for some constant $C$. If $\xi$ is in $Z$ and
$|\xi| \ge 1$, then we have $c|\xi|^m \le |P_m(2\pi i\xi)| = |Q(2\pi i\xi)| \le C|\xi|^{m-1}$, and we
conclude that $|\xi| \le C/c$. This proves the lemma.


Fix an elliptic operator $P(D)$, and choose $R > 0$ by the lemma such that all the
zeros in $\mathbb{R}^N$ of $P(2\pi i\xi)$ lie in the closed ball of radius $R$ centered at the origin.
Fix numbers $R'$ and $R''$ with $R' > R'' > R$. Let $\chi$ be a smooth function on $\mathbb{R}^N$
with values in $[0, 1]$ such that $\chi(\xi)$ is 0 when $|\xi| \le R''$ and is 1 when $|\xi| \ge R'$.
The formal computation is as follows: if we define $v$ in terms of $f$ by

\[ v(x) = \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}\, \frac{\chi(\xi)\,\mathcal{F}(f)(\xi)}{P(2\pi i\xi)}\, d\xi, \]

then Fourier inversion gives

\[ \begin{aligned}
(P(D)v)(x) &= \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}\, \mathcal{F}(f)(\xi)\,\chi(\xi)\, d\xi \\
&= f(x) + \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}\, (\chi(\xi) - 1)\,\mathcal{F}(f)(\xi)\, d\xi,
\end{aligned} \]

and the second term on the right side will be seen to be a smoothing term. Let
us now state a precise result and use properties of distributions to make this
computation rigorous.


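Theorem 7.14. Let $P(D)$ be an elliptic operator on $\mathbb{R}^N$ with constant
coefficients. Then there exist a distribution $k \in \mathcal{S}'(\mathbb{R}^N)$ and a Schwartz function
$h \in \mathcal{F}^{-1}(C^\infty_{\mathrm{com}}(\mathbb{R}^N))$ such that

\[ P(D)k = \delta + T_h, \]

as an equality in $\mathcal{S}'(\mathbb{R}^N)$. Here $\delta$ is the Dirac distribution $\langle \delta, \varphi \rangle = \varphi(0)$.
Consequently whenever $f$ is in $\mathcal{E}'(\mathbb{R}^N)$, then the distribution $v = k * f$ is tempered
and satisfies $P(D)v = f + (h * f)$.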


REMARKS. The convolution operator $f \mapsto k * f$ is called a parametrix for
$P(D)$ on $\mathcal{E}'(\mathbb{R}^N)$. More precisely it is a right parametrix, and a left parametrix
can be defined similarly. The operator $f \mapsto h * f$ is called a smoothing operator
because $h * f$ is in $C^\infty(\mathbb{R}^N)$ whenever $f$ is in $\mathcal{E}'(\mathbb{R}^N)$. To see the smoothing
property, we observe that $h$, as a Schwartz function, is identified with a tempered
distribution when we pass to $T_h$. Theorem 5.21 shows that $T_h * f$ is a tempered
distribution with Fourier transform $\mathcal{F}(h)\mathcal{F}(f)$. Both factors $\mathcal{F}(h)$ and $\mathcal{F}(f)$ are
smooth functions, and $\mathcal{F}(h)$ has compact support. Therefore $\mathcal{F}(h * f)$ is smooth
of compact support, and $h * f$ is a Schwartz function.

PROOF. The function $\sigma(\xi) = \chi(\xi)/P(2\pi i\xi)$ is smooth and is bounded on
$\mathbb{R}^N$ because, in the notation used in the proof of Lemma 7.13, $|P(2\pi i\xi)| \ge
|P_m(2\pi i\xi)| - |Q(2\pi i\xi)| \ge (c|\xi| - C)|\xi|^{m-1}$ and because $(c|\xi| - C)|\xi|^{m-1} \ge 1$
as soon as $|\xi|$ is large enough. Since $\sigma$ is bounded, integration of the product of $\sigma$
and any Schwartz function is meaningful, and $T_\sigma$ is therefore in $\mathcal{S}'(\mathbb{R}^N)$. Define
$k = \mathcal{F}^{-1}(T_\sigma)$. This is in $\mathcal{S}'(\mathbb{R}^N)$ and has $\mathcal{F}(k) = T_\sigma$. Define $h = \mathcal{F}^{-1}(\chi - 1)$.
Since $\chi - 1$ is in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, $h$ is in $\mathcal{S}(\mathbb{R}^N)$.

Now let $f$ in $\mathcal{E}'(\mathbb{R}^N)$ be given, and define $v = k * f$. Theorem 5.21 shows
that $v$ is in $\mathcal{S}'(\mathbb{R}^N)$ and that $\mathcal{F}(v) = \mathcal{F}(k)\mathcal{F}(f) = \sigma\mathcal{F}(f)$. Then

\[ \begin{aligned}
\mathcal{F}(P(D)v) &= P(2\pi i\xi)\mathcal{F}(v) = P(2\pi i\xi)\sigma(\xi)\mathcal{F}(f) \\
&= \chi(\xi)\mathcal{F}(f) = \mathcal{F}(f) + (\chi(\xi) - 1)\mathcal{F}(f) = \mathcal{F}(f) + \mathcal{F}(h)\mathcal{F}(f).
\end{aligned} \]

Taking the inverse Fourier transform of both sides yields $P(D)v = f + h * f$
as asserted. For the special case $f = \delta$, we have $v = k * \delta = k$, and then
$P(D)k = \delta + T_h$. This completes the proof.


The function $h$ is the inverse Fourier transform of a member of $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$,
specifically $h(x) = \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}(\chi(\xi) - 1)\, d\xi$. Since the integration is really
taking place on a compact set, we see that we can replace $x$ by a complex variable
$z$ and obtain a holomorphic function in all of $\mathbb{C}^N$. In other words, $h$ extends
to a holomorphic function on $\mathbb{C}^N$. If we single out any variable, say $x_1$, then
the ellipticity of $P(D)$ implies that $D_1^m$ has nonzero coefficient in $P(D)$, and
$P(D)w = h$ is therefore an equation to which the global Cauchy–Kovalevskaya
Theorem applies in the form of Theorem 7.4. The theorem says that the equation
$P(D)w = h$, in the presence of globally holomorphic Cauchy data, has not just a
local holomorphic solution but a global holomorphic one. Therefore $P(D)w =
h$ has an entire holomorphic solution $w$. Let us regard $w$ and $h$ as yielding
distributions $T_w$ and $T_h$ on $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, so that the equation reads $P(D)T_w = T_h$.
Subtracting this from $P(D)k = \delta + T_h$ yields $P(D)(k - T_w) = \delta$. In summary
we have the following corollary.


Corollary 7.15. If $P(D)$ is an elliptic operator on $\mathbb{R}^N$ with constant
coefficients, then there exists $e$ in $\mathcal{D}'(\mathbb{R}^N)$ with $P(D)e = \delta$.


The distribution $e$ is called a fundamental solution for $P(D)$ in $\mathcal{D}'(\mathbb{R}^N)$.
A consequence of the existence of $e$ is that $P(D)u = f$ has a solution $u$ in
$\mathcal{D}'(\mathbb{R}^N)$ for each $f$ in $\mathcal{E}'(\mathbb{R}^N)$. This represents an improvement in the conclusion
(fundamental solution vs. parametrix) of Theorem 7.14.

Think of Corollary 7.15 as being an existence theorem. We now turn to a
discussion of the regularity of solutions. For this we do not need the existence
result, and thus we shall proceed without making further use of Corollary 7.15.

Proposition 7.16. Let $P(D)$ be an elliptic operator on $\mathbb{R}^N$ with constant
coefficients. Then the tempered distribution $k = \mathcal{F}^{-1}(T_\sigma)$, where $\sigma(\xi) =
\chi(\xi)/P(2\pi i\xi)$, is a smooth function on $\mathbb{R}^N - \{0\}$. Therefore, for any
neighborhood of 0, the elliptic operator $P(D)$ has a parametrix $k_0 \in \mathcal{E}'(\mathbb{R}^N)$ with
compact support in that neighborhood. In particular, there is a smooth function
$h_1$ with support in that neighborhood such that whenever $f$ is in $\mathcal{E}'(\mathbb{R}^N)$, then
the distribution $v = k_0 * f$ is in $\mathcal{E}'(\mathbb{R}^N)$ and satisfies $P(D)v = f + (h_1 * f)$.


SKETCH OF PROOF. One checks that

\[ D^\beta(x^\alpha k) = (2\pi i)^{|\beta|}(-2\pi i)^{-|\alpha|}\, \mathcal{F}^{-1}(T_{\xi^\beta D^\alpha \sigma}). \]

Here $\xi^\beta D^\alpha \sigma$ is a $C^\infty$ function, and we are interested in its integrability. It is
enough to consider what happens for $|\xi| \ge R'$, where $\sigma(\xi) = 1/P(2\pi i\xi)$. The
function $1/P(2\pi i\xi)$ is bounded above by a multiple of $|\xi|^{-m}$, and an inductive
argument on the order of the derivative shows that $|\xi^\beta D^\alpha \sigma| \le C|\xi|^{|\beta| - |\alpha| - m}$ for
$|\xi| \ge R'$, for a constant $C$ independent of $\xi$.

Take $\beta = 0$. If $|\alpha|$ is large enough, we see that $D^\alpha \sigma$ is in $L^1(\mathbb{R}^N)$. Then
$\mathcal{F}^{-1}(D^\alpha \sigma) = (-2\pi i)^{|\alpha|} x^\alpha k$ is given by the usual integral formula for $\mathcal{F}$, but with
$e^{-2\pi i x\cdot\xi}$ replaced by $e^{2\pi i x\cdot\xi}$. Therefore $x^\alpha k$ is a bounded continuous function
when $|\alpha|$ is large enough. Applying this observation to $\big(\sum_{j=1}^{N} |x_j|^{2l}\big)k$ for large
enough $l$, we find that $k$ is a continuous function on $\mathbb{R}^N - \{0\}$.

Next take $|\beta| = 1$ and increase $l$ by 1, writing $\alpha'$ for the new $\alpha$. Then $\xi^\beta D^{\alpha'}\sigma$
is integrable, and it follows⁵ that $x^{\alpha'}k$ has a pointwise partial derivative of type $\beta$
and is continuous. Thus the same thing is true of $k$ on $\mathbb{R}^N - \{0\}$.

Iterating this argument by adding 1 to one of the entries of $\beta$ to obtain $\beta'$,
we find for each $\beta$ that we consider, that the functions $D^\beta\big(\big(\sum_{j=1}^{N} |x_j|^{2l'}\big)k\big)$ and
$D^{\beta'}\big(\big(\sum_{j=1}^{N} |x_j|^{2l'}\big)k\big)$ are integrable for $l'$ sufficiently large, and we deduce that $D^\beta k$
has all first partial derivatives continuous. Since $\beta'$ is arbitrary, $k$ equals a smooth
function on $\mathbb{R}^N - \{0\}$.

To finish the argument, let $k$ and $h$ be as in Theorem 7.14, and let $\psi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$
be identically 1 near 0 and have support in whatever neighborhood of 0 has been
specified. If we write $k = \psi k + (1 - \psi)k$, then $k_0 = \psi k$ has support in that same
neighborhood, and $T = (1 - \psi)k$ is of the form $T_{h_0}$ for some smooth function
$h_0$, by what we have shown. Substituting $k = k_0 + T_{h_0}$ into $P(D)k = \delta + T_h$, we
find that $P(D)k_0 = \delta + T_h - T_{P(D)h_0}$. The function $h_1 = h - P(D)h_0$ is smooth,
and it must have compact support since $P(D)k_0$ and $\delta$ have compact support.


Corollary 7.17. If $u$ is in $\mathcal{D}'(\mathbb{R}^N)$ and $P(D)$ is elliptic, then
$\operatorname{sing\,supp} u \subseteq \operatorname{sing\,supp} P(D)u$, where “sing supp” denotes singular support.


⁵The precise result to use is Proposition 8.1f of Basic.

REMARK. At first glance it might seem that the rough spots of $P(D)u$ are
surely at least as bad as the rough spots of $u$ for any $D$. But consider a function
on $\mathbb{R}^2$ of the form $u(x, y) = g(y)$ and apply $P(D) = \partial/\partial x$. The result is 0, and
thus $\operatorname{sing\,supp} u$ can properly contain $\operatorname{sing\,supp} P(D)u$ for $P(D) = \partial/\partial x$. The
corollary says that this kind of thing does not happen if $P(D)$ is elliptic.


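PROOF. Let $E = (\operatorname{sing\,supp} P(D)u)^c$. By definition the restriction of $P(D)u$
to $C^\infty_{\mathrm{com}}(E)$ is of the form $T_\psi$ with $\psi$ in $C^\infty(E)$. Let $U$ be any nonempty open
set with $U^{\mathrm{cl}}$ compact and with $U^{\mathrm{cl}} \subseteq E$. It is enough to exhibit a smooth
function $\eta$ equal to $u$ on $U$. Choose an open set $V$ with $V^{\mathrm{cl}}$ compact such that
$U^{\mathrm{cl}} \subseteq V \subseteq V^{\mathrm{cl}} \subseteq E$. Multiply $\psi$ by a smooth function of compact support in $E$
that equals 1 on $V^{\mathrm{cl}}$, obtaining a function $\psi_0 \in C^\infty_{\mathrm{com}}(E)$ such that $\psi_0 = \psi$ on $V$.

Choose an open neighborhood $W$ of 0 such that $W = -W$ and such that the
set of sums $U^{\mathrm{cl}} + W^{\mathrm{cl}}$ is contained in $V$. Applying Proposition 7.16, we can write
$P(D)k_0 = \delta + h'$ with $k_0 \in \mathcal{E}'(\mathbb{R}^N)$ and $h' \in C^\infty_{\mathrm{com}}(\mathbb{R}^N)$. The proposition allows
us to insist that the support of $k_0$ be contained in $W$. Then also $h'$ has support
contained in $W$.

We are to produce $\eta \in C^\infty(U)$ with $\langle T_\eta, \varphi \rangle = \langle u, \varphi \rangle$ for all $\varphi \in C^\infty_{\mathrm{com}}(U)$.
Our choice of $W$ forces $k_0^\vee * \varphi$ to have support in $V$. Hence

\[ \langle k_0 * P(D)u, \varphi \rangle = \langle P(D)u, k_0^\vee * \varphi \rangle = \langle T_\psi, k_0^\vee * \varphi \rangle
 = \langle T_{\psi_0}, k_0^\vee * \varphi \rangle = \langle k_0 * \psi_0, \varphi \rangle. \]

On the other hand, application of Corollary 5.14 gives

\[ \langle k_0 * P(D)u, \varphi \rangle = \langle P(D)k_0 * u, \varphi \rangle = \langle (\delta + h') * u, \varphi \rangle
 = \langle u, \varphi \rangle + \langle h' * u, \varphi \rangle. \]

Combining the two computations, we see that $\langle u, \varphi \rangle = \langle k_0 * \psi_0 - h' * u, \varphi \rangle$, and
the proof is complete if we take $\eta$ to be $k_0 * \psi_0 - h' * u$.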


The final step is to localize the result of Corollary 7.17. 


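Corollary 7.18. If $P(D)$ is elliptic with constant coefficients, if $U$ is nonempty
and open in $\mathbb{R}^N$, and if $u$ and $f$ are members of $\mathcal{D}'(U)$ with $P(D)u = f$, then
$\operatorname{sing\,supp} u \subseteq \operatorname{sing\,supp} f$. Consequently if $f$ is a smooth function on $U$, then
so is $u$.


REMARKS. For the Laplacian this result gives something beyond the results in
Chapter III: Part of the statement is that any distribution solution $u$ of $\Delta u = 0$ on
an open set $U$ equals a smooth function on $U$. Previously the best result of this
kind that we had was Corollary 3.17, which says that any distribution solution
equal to a $C^2$ function is a smooth function.

PROOF. It is enough to prove that $E \cap \operatorname{sing\,supp} u \subseteq E \cap \operatorname{sing\,supp} f$ for each
open set $E$ with $E^{\mathrm{cl}}$ compact and $E^{\mathrm{cl}} \subseteq U$. Choose $\psi$ in $C^\infty_{\mathrm{com}}(U)$ with $\psi$ equal
to 1 on $E^{\mathrm{cl}}$. The equality $\langle \psi u, \varphi \rangle = \langle u, \psi\varphi \rangle = \langle u, \varphi \rangle$ for all $\varphi \in C^\infty_{\mathrm{com}}(E)$ shows
that $E \cap \operatorname{sing\,supp} u = E \cap \operatorname{sing\,supp} \psi u$. Regard $\psi u$ as in $\mathcal{E}'(\mathbb{R}^N)$, and define
$g = P(D)(\psi u)$. Both $\psi u$ and $g$ are in $\mathcal{E}'(\mathbb{R}^N)$, and every $\varphi \in C^\infty_{\mathrm{com}}(E)$ satisfies

\[ \begin{aligned}
\langle g, \varphi \rangle &= \langle P(D)(\psi u), \varphi \rangle = \langle \psi u, P(D)^{\mathrm{tr}}\varphi \rangle \\
&= \langle u, P(D)^{\mathrm{tr}}\varphi \rangle = \langle P(D)u, \varphi \rangle = \langle f, \varphi \rangle.
\end{aligned} \]

Hence $E \cap \operatorname{sing\,supp} g = E \cap \operatorname{sing\,supp} f$. Application of Corollary 7.17
therefore gives

\[ E \cap \operatorname{sing\,supp} u = E \cap \operatorname{sing\,supp} \psi u \subseteq E \cap \operatorname{sing\,supp} g = E \cap \operatorname{sing\,supp} f, \]

and the result follows.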


6. Method of Pseudodifferential Operators 


Linear elliptic equations with variable coefficients were already well understood 
by the end of the 1950s. The methods to analyze them combined compactness 
arguments for operators between Banach spaces with the use of Sobolev spaces 
and similar spaces of functions. Those methods were of limited utility for other 
kinds of linear partial equations, but some isolated methods had been developed 
to handle certain cases of special interest. In the 1960s a general theory of 
pseudodifferential operators was introduced to include all these methods under 
a single umbrella, and it and its generalizations are now a standard device for 
studying linear partial differential equations. They provide a tool for taking 
advantage of point-by-point knowledge of the zero locus of the principal symbol. 

As with distributions, pseudodifferential operators make certain kinds of cal- 
culations quite natural, and many verifications lie behind their use. We shall omit 
most of this detail and concentrate on some of the ideas behind extending the 
theory of the previous section to variable-coefficient operators. 

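We start with a nonempty open subset $U$ of $\mathbb{R}^N$ and a linear differential operator
$P(x, D) = \sum_{|\alpha| \le m} a_\alpha(x) D^\alpha$ whose coefficients $a_\alpha(x)$ are in $C^\infty(U)$. If $u$ is in
$C^\infty_{\mathrm{com}}(U)$, we can regard $u$ as in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$. The function $u$ is then a Schwartz
function, and the Fourier inversion formula holds:

\[ u(x) = \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}\, \hat{u}(\xi)\, d\xi, \]

where $\hat{u}$ is the Fourier transform $\hat{u}(\xi) = \int_{\mathbb{R}^N} e^{-2\pi i x\cdot\xi} u(x)\, dx$. Applying $P$ gives

\[ \begin{aligned}
P(x, D)u(x) &= \sum_{|\alpha| \le m} a_\alpha(x) (2\pi i)^{|\alpha|} \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}\, \xi^\alpha\, \hat{u}(\xi)\, d\xi \\
&= \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi} \Big( \sum_{|\alpha| \le m} a_\alpha(x)(2\pi i)^{|\alpha|}\xi^\alpha \Big) \hat{u}(\xi)\, d\xi
 = \int_{\mathbb{R}^N} e^{2\pi i x\cdot\xi}\, P(x, 2\pi i\xi)\, \hat{u}(\xi)\, d\xi,
\end{aligned} \]

where $P(x, 2\pi i\xi)$ is the symbol. The basic idea of the theory is to enlarge the
class of allowable symbols, thereby enlarging the class of operators under study,
at least enough to include the parametrices and related operators of the previous
section. The enlarged class will be the class of pseudodifferential operators.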
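
The role of the symbol can be seen on a single exponential $e^{2\pi i x\xi_0}$, which is the building block of the Fourier integral above: the operator acts on it as multiplication by $P(x, 2\pi i\xi_0)$. The Python sketch below checks this in one variable by finite differences. It is an illustration under assumptions: the second-order operator, its coefficients `a2`, `a1`, `a0`, and the frequency are made up, and a plane wave is not itself a member of $C^\infty_{\mathrm{com}}(U)$.

```python
import cmath
import math

XI0 = 0.7    # frequency of the test plane wave (arbitrary choice)

def a2(x):
    """Made-up smooth coefficient of D^2."""
    return 1.0 + x * x

def a1(x):
    """Made-up smooth coefficient of D."""
    return math.sin(x)

def a0(x):
    """Made-up smooth coefficient of the zeroth-order term."""
    return 2.0

def symbol(x, xi):
    """P(x, 2 pi i xi) = a2(x) (2 pi i xi)^2 + a1(x) (2 pi i xi) + a0(x)."""
    w = 2j * math.pi * xi
    return a2(x) * w * w + a1(x) * w + a0(x)

def plane_wave(x):
    """The exponential e^{2 pi i x XI0}."""
    return cmath.exp(2j * math.pi * x * XI0)

def apply_operator(func, x, h=1e-4):
    """Approximate P(x, D) func = a2 func'' + a1 func' + a0 func by central differences."""
    d1 = (func(x + h) - func(x - h)) / (2 * h)
    d2 = (func(x + h) - 2 * func(x) + func(x - h)) / (h * h)
    return a2(x) * d2 + a1(x) * d1 + a0(x) * func(x)

x0 = 0.3
difference = abs(apply_operator(plane_wave, x0) - symbol(x0, XI0) * plane_wave(x0))
print(difference)
```

The difference is at the level of the finite-difference error, which is the one-frequency version of moving $P(x, D)$ under the integral sign.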

In the constant-coefficient case, in which $P(x, 2\pi i\xi)$ reduces to $P(2\pi i\xi)$,
what we did in essence was to introduce an operator of the above kind, at first with
$1/P(2\pi i\xi)$ in the integrand in place of $P(2\pi i\xi)$ but then with $\chi(\xi)/P(2\pi i\xi)$
instead of $1/P(2\pi i\xi)$ in the integrand in order to eliminate the singularities.
When we composed the two operators, the result was the sum of the identity and
a smoothing operator.

In the variable-coefficient case, the operator we use has to be more complicated.
Suppose that we want $P(x,D)G=1+\text{smoothing}$, with $G$ given
by the same kind of formula as $P(x,D)$ but with its symbol $g(x,\xi)$ in some
wider class. If the equation in question is $P(x,D)u=f$, then our computation
above shows that we want to work with $P(x,D)\big(\int_{\mathbb{R}^N}e^{2\pi i x\cdot\xi}g(x,\xi)\widehat{f}(\xi)\,d\xi\big)$.
The effect of putting $P(x,D)$ under the integral sign is not achieved by including
$P(x,2\pi i\xi)$ in the integrand, because the product $e^{2\pi i x\cdot\xi}g(x,\xi)$ is being
differentiated. A brief formal computation shows that $D^\alpha\big(e^{2\pi i x\cdot\xi}g(x,\xi)\big)=
e^{2\pi i x\cdot\xi}\big((D_x+2\pi i\xi)^\alpha g(x,\xi)\big)$, where the subscript $x$ is included on $D_x$ to
emphasize that the differentiation is with respect to $x$. Thus we want
$P(x,D_x+2\pi i\xi)g(x,\xi)$ to be close to identically 1, differing by the symbol of
a "smoothing operator." We cannot simply divide by $P(x,D_x+2\pi i\xi)$ because
of the presence of the $D_x$. What we can do is expand in terms of degrees of
homogeneity in $\xi$ and sort everything out. When degrees of homogeneity are
counted, $\xi^\alpha$ has degree $|\alpha|$ while $D_x$ has degree 0. Expansion of $P$ gives

\[
P(x,D_x+2\pi i\xi)=P_m(x,2\pi i\xi)+\sum_{j=0}^{m-1}p_j(x,\xi,D_x),
\]

where $P_m$ is the principal symbol and $p_j$ is homogeneous in $\xi$ of degree $j$. No
$D_x$ is present in $P_m$ because degree $m$ in $\xi$ can occur only from terms $(2\pi i\xi)^\alpha$ in
$(D_x+2\pi i\xi)^\alpha$. Since the constant function of $\xi$ has homogeneity degree 0 and
since degrees of homogeneity add, let us look for an expansion of $g(x,\xi)$ in the
form

\[
g(x,\xi)=\sum_{j=0}^{\infty}g_j(x,\xi),
\]

with $g_j$ homogeneous in $\xi$ of degree $-m-j$. Expanding the product

\[
\Big(P_m(x,2\pi i\xi)+\sum_{k=0}^{m-1}p_k(x,\xi,D_x)\Big)\Big(\sum_{j=0}^{\infty}g_j(x,\xi)\Big)=1
\]

and collecting terms by degree of homogeneity, we read off equations

\[
\begin{aligned}
P_m(x,2\pi i\xi)g_0(x,\xi)&=1,\\
P_m(x,2\pi i\xi)g_1(x,\xi)+p_{m-1}(x,\xi,D_x)g_0(x,\xi)&=0,\\
P_m(x,2\pi i\xi)g_2(x,\xi)+p_{m-1}(x,\xi,D_x)g_1(x,\xi)+p_{m-2}(x,\xi,D_x)g_0(x,\xi)&=0,
\end{aligned}
\]

and so on. Dividing each equation by $P_m(x,2\pi i\xi)$, we obtain recursive formulas
for the $g_j(x,\xi)$'s, except for the problem that $P_m(x,2\pi i\xi)$ vanishes for $\xi=0$. To
handle this vanishing, we again have to introduce a function like $\chi(\xi)$ by which
to multiply $g_j$, and it turns out that in order to produce convergence, $\chi$ has to be
allowed to depend on $j$. After the $g_j$'s have been adjusted, we need to assemble an
adjusted $g$ from them and form a right parametrix, namely the pseudodifferential
operator $G$ corresponding to symbol $g(x,\xi)$ such that $P(x,D)G=1+R$, where
$R$ is a "smoothing operator."
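
The first two steps of the recursion can be carried out by hand in one variable. The following sketch uses a hypothetical operator $P(x,D)=a(x)D^2+b(x)D$ with $a$ and $b$ smooth and $a$ nowhere vanishing; this operator is an assumption made for illustration and does not come from the text. Here $m=2$ and $a'$ denotes $D_xa$.

```latex
\[
P(x,D_x+2\pi i\xi)=a\,(D_x+2\pi i\xi)^2+b\,(D_x+2\pi i\xi)
=\underbrace{-4\pi^2a\,\xi^2}_{P_2(x,2\pi i\xi)}
+\underbrace{2\pi i\xi\,(2aD_x+b)}_{p_1(x,\xi,D_x)}
+\underbrace{aD_x^2+bD_x}_{p_0(x,\xi,D_x)},
\]
\[
g_0=\frac{1}{P_2}=-\frac{1}{4\pi^2a\,\xi^2},
\qquad
p_1g_0=\frac{i\,(2a'-b)}{2\pi a\,\xi},
\qquad
g_1=-\frac{p_1g_0}{P_2}=\frac{i\,(2a'-b)}{8\pi^3a^2\,\xi^3}.
\]
```

As expected, $g_0$ is homogeneous of degree $-2$ and $g_1$ of degree $-3$ in $\xi$, and both are singular at $\xi=0$, which is why a cutoff function is needed before the $g_j$ can serve as symbols.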

To make all this at all precise, we need to be more specific about a class of
symbols, about the definition of the corresponding pseudodifferential operators,
about the recognition of "smoothing operators," and about the assembly of the
symbol from the sequence of homogeneous terms.

Fix a nonempty open set $U$ in $\mathbb{R}^N$, and fix a real number $m$, not necessarily
an integer. The symbol class known as $S^m_{1,0}(U)$ and called the class of standard
symbols of order $m$ consists of the set of all functions $g$ in $C^\infty(U\times\mathbb{R}^N)$ such
that for each compact set $K\subseteq U$ and each pair of multi-indices $\alpha$ and $\beta$, there
exists a constant $C_{K,\alpha,\beta}$ with⁶

\[
|D_x^\beta D_\xi^\alpha g(x,\xi)|\le C_{K,\alpha,\beta}(1+|\xi|)^{m-|\alpha|}\qquad\text{for }x\in K,\ \xi\in\mathbb{R}^N.
\]

Then $D_x^\beta D_\xi^\alpha g$ will be a symbol in the class $S^{m-|\alpha|}_{1,0}(U)$. Let $S^{-\infty}_{1,0}(U)$ be the
intersection of all $S^{-n}_{1,0}(U)$ for $n\ge 0$.


⁶The symbol class $S^m_{1,0}(U)$ is not the historically first class of symbols to have been studied, but
it has come to be the usual one. Classes $S^m_{\rho,\delta}(U)$ occur frequently as well, but we shall not discuss
them.

EXAMPLES.
(1) If $P(x,D)=\sum_{|\alpha|\le m}a_\alpha(x)D^\alpha$ with all $a_\alpha$ in $C^\infty(U)$, then its symbol
$P(x,2\pi i\xi)=\sum_{|\alpha|\le m}a_\alpha(x)(2\pi i)^{|\alpha|}\xi^\alpha$ is in $S^m_{1,0}(U)$.

(2) If $P(x,D)$ in Example 1 is elliptic, then the parametrix $g(x,\xi)$ that we
construct will be in $S^{-m}_{1,0}(U)$.

(3) With $P$ and $g$ formed as in Examples 1 and 2, the error term $r(x,\xi)$ such that
$P(x,D_x+2\pi i\xi)g(x,\xi)=1+r(x,\xi)$ will be in $S^{-\infty}_{1,0}(U)$. The corresponding
pseudodifferential operator will be a "smoothing operator" in a sense to be defined
below.
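
A symbol need not be a polynomial in $\xi$. As a rough numerical sanity check (not a proof, and not part of the text), the sketch below samples the one-variable function $g(\xi)=(1+\xi^2)^{m/2}$, which is independent of $x$, and measures the ratio $|D_\xi^\alpha g|/(1+|\xi|)^{m-\alpha}$ for $\alpha=0,1$. The function names and the sample grid are illustrative choices; boundedness of the ratio on the sample is consistent with $g$ lying in $S^m_{1,0}$ for non-integer or negative $m$ as well.

```python
def bracket(xi: float, m: float) -> float:
    """g(xi) = (1 + xi^2)^(m/2), a candidate symbol of order m (independent of x)."""
    return (1.0 + xi * xi) ** (m / 2.0)


def d_bracket(xi: float, m: float) -> float:
    """Exact first derivative of bracket() with respect to xi."""
    return m * xi * (1.0 + xi * xi) ** (m / 2.0 - 1.0)


def sup_ratio(m: float, alpha: int, xis) -> float:
    """Largest sampled value of |D^alpha g| / (1 + |xi|)^(m - alpha), for alpha in {0, 1}."""
    deriv = bracket if alpha == 0 else d_bracket
    return max(abs(deriv(xi, m)) / (1.0 + abs(xi)) ** (m - alpha) for xi in xis)


# Sample points spread over many scales, symmetric about 0.
SAMPLE = [s * 10.0 ** k for k in range(-3, 7) for s in (-5.0, -1.0, 1.0, 5.0)] + [0.0]

if __name__ == "__main__":
    for m in (-1.5, 0.5, 2.0):
        print(m, sup_ratio(m, 0, SAMPLE), sup_ratio(m, 1, SAMPLE))
```

The elementary inequality $(1+|\xi|)/\sqrt2\le(1+\xi^2)^{1/2}\le 1+|\xi|$ explains why the sampled ratios stay bounded: it converts powers of $(1+\xi^2)^{1/2}$ into powers of $1+|\xi|$ at the cost of a constant.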


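To a standard symbol $g$, we associate a pseudodifferential operator $G=
G(x,D)$ first on smooth functions and then on distributions.⁷ The associated
$G:C^\infty_{\mathrm{com}}(U)\to C^\infty(U)$ for a symbol $g\in S^m_{1,0}(U)$ is given by

\[
(G\varphi)(x)=\int_{\mathbb{R}^N}e^{2\pi i x\cdot\xi}g(x,\xi)\,\widehat{\varphi}(\xi)\,d\xi\qquad\text{for }\varphi\in C^\infty_{\mathrm{com}}(U),\ x\in U.
\]

One readily checks that $G\varphi$ is indeed in $C^\infty(U)$ and that $G:C^\infty_{\mathrm{com}}(U)\to C^\infty(U)$
is continuous. The associated $G:\mathcal{E}'(U)\to\mathcal{D}'(U)$ is given by⁸

\[
\langle Gf,\varphi\rangle=\int_{\mathbb{R}^N}\Big[\int_U e^{2\pi i x\cdot\xi}g(x,\xi)\varphi(x)\,dx\Big]\mathcal{F}(f)(\xi)\,d\xi\qquad\text{for }f\in\mathcal{E}'(U).
\]

(Recall that $\mathcal{F}(f)$ is a smooth function, according to Theorem 5.20.) One readily
checks that $\langle Gf,\varphi\rangle$ is well defined, that $Gf$ is in $\mathcal{D}'(U)$, and that when $f=T_\psi$
for some $\psi\in C^\infty_{\mathrm{com}}(U)$, then $G(T_\psi)=T_{G\psi}$.

The error term in constructing a parametrix is ultimately handled by the following
fact: if $g$ is a symbol in $S^{-\infty}_{1,0}(U)$, then $G$ carries $\mathcal{E}'(U)$ into $C^\infty(U)$. For
this reason the pseudodifferential operators with symbol in $S^{-\infty}_{1,0}(U)$ are called
smoothing operators.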
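
A simple smoothing operator, offered here as an illustration rather than as an example from the text, comes from the Gaussian symbol $g(x,\xi)=e^{-\pi|\xi|^2}$ on $U=\mathbb{R}^N$. Every $\xi$ derivative of this symbol decays faster than any power of $1+|\xi|$, so that the symbol is in $S^{-\infty}_{1,0}$, and the operator is convolution with a Gaussian:

```latex
\[
(G\varphi)(x)=\int_{\mathbb{R}^N}e^{2\pi i x\cdot\xi}\,e^{-\pi|\xi|^2}\,\widehat{\varphi}(\xi)\,d\xi
=\int_{\mathbb{R}^N}e^{-\pi|x-y|^2}\varphi(y)\,dy .
\]
```

The second equality uses the fact that $e^{-\pi|\xi|^2}$ is its own Fourier transform under the normalization $\widehat{u}(\xi)=\int e^{-2\pi i x\cdot\xi}u(x)\,dx$ used above. Convolution of a compactly supported distribution with this Gaussian is a smooth function, in line with the fact just stated.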

⁷Pseudodifferential operators can be used with other domains, such as Sobolev spaces, in order
to obtain additional quantitative information. But we shall not pursue such lines of investigation
here. Further comments about this matter occur in Section VIII.8.

⁸Our standard procedure for defining operations on distributions has consistently been to define
the operation on smooth functions, to exhibit an explicit formula for the transpose operator on
smooth functions and observe that the transpose is continuous, and to use the transpose operator
to define the operator on distributions. This procedure avoids the introduction of topologies on
spaces of distributions. In the present discussion of the operation of a pseudodifferential operator
on distributions, we defer the introduction of transpose to Section VIII.6.

With the definitions made, let us return to the construction of a right parametrix
for the elliptic differential operator $P(x,D)$. Let us write $p_m(x,\xi,D_x)$ for
the principal symbol $P_m(x,2\pi i\xi)$ in order to make the notation uniform. The
recursive computation given above produces expressions $g_j(x,\xi)$ for $j\ge 0$ such
that

\[
\Big(\sum_{k=0}^{m}p_k(x,\xi,D_x)\Big)\Big(\sum_{j=0}^{\infty}g_j(x,\xi)\Big)=1
\]

in a formal sense. The actual $g_j(x,\xi)$'s are not standard symbols because the
formula for $g_j(x,\xi)$ involves division by $(p_m(x,\xi))^{j+1}$ and because $p_m(x,\xi)$
vanishes at $\xi=0$. However, the product $\chi_j(\xi)g_j(x,\xi)$ is a standard symbol if $\chi_j$
is a smooth function identically 0 near $\xi=0$ and identically 1 off some compact
set. Thus we attempt to form the sum

\[
g(x,\xi)=\sum_{j=0}^{\infty}\chi_j(\xi)g_j(x,\xi)
\]

and use it as parametrix. Again we encounter a problem: we find that convergence
is not automatic. More care is needed. What works is to define
$\chi_j(\xi)=\chi(R_j^{-1}|\xi|)$, where $\chi:\mathbb{R}\to[0,1]$ is a smooth function that is 0 for
$|t|\le\tfrac12$ and is 1 for $|t|\ge 1$. One shows that positive numbers $R_j$ tending
to infinity can be constructed so that the partial sums in the series for $g(x,\xi)$
converge in $C^\infty(U\times\mathbb{R}^N)$ and the result is in the symbol class $S^{-m}_{1,0}(U)$. Let $G$
be the pseudodifferential operator corresponding to $g(x,\xi)$.
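
The text only requires that some smooth $\chi$ with the stated properties exist. One standard way to build such a function, sketched below under that assumption, starts from $s\mapsto e^{-1/s}$ for $s>0$; the names `chi` and `chi_j` are illustrative, and the radii $R_j$ must still be chosen as the argument above requires.

```python
import math


def _bump(s: float) -> float:
    """exp(-1/s) for s > 0 and 0 otherwise; this function is smooth on the real line."""
    return math.exp(-1.0 / s) if s > 0.0 else 0.0


def chi(t: float) -> float:
    """Smooth cutoff with values in [0, 1]: 0 for |t| <= 1/2 and 1 for |t| >= 1."""
    s = 2.0 * abs(t) - 1.0  # maps |t| in [1/2, 1] onto s in [0, 1]
    # The denominator is positive because s > 0 or 1 - s > 0 always holds.
    return _bump(s) / (_bump(s) + _bump(1.0 - s))


def chi_j(xi_norm: float, R_j: float) -> float:
    """chi_j(xi) = chi(|xi| / R_j): vanishes for |xi| <= R_j / 2, equals 1 for |xi| >= R_j."""
    return chi(xi_norm / R_j)


if __name__ == "__main__":
    for t in (0.0, 0.5, 0.6, 0.75, 0.9, 1.0, 2.0):
        print(t, chi(t))
```

Because $\chi_j$ vanishes for $|\xi|\le R_j/2$, multiplying $g_j$ by $\chi_j$ removes the singularity at $\xi=0$, and letting $R_j$ grow pushes the support of the $j$th term farther out, which is what makes convergence of the series possible.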
A little computation shows that

\[
P(x,D_x+2\pi i\xi)g(x,\xi)=1+r(x,\xi),
\]

where

\[
r(x,\xi)=-1+\chi_0(\xi)+\sum_{j=1}^{\infty}r_j(x,\xi)
\]

and

\[
r_j(x,\xi)=\sum_{k=1}^{\min\{j,m\}}\big[\chi_{j-k}(\xi)-\chi_j(\xi)\big]\,p_{m-k}(x,\xi,D_x)\,g_{j-k}(x,\xi).
\]

The function $r_j(x,\xi)$ is in $C^\infty(U\times\mathbb{R}^N)$ and vanishes for $|\xi|\ge R_j$. This fact,
the identities already established, and the construction of the numbers $R_j$ allow
one to see that $\sum_{j=n+1}^{\infty}r_j(x,\xi)$ is in $S^{-n}_{1,0}(U)$. Since the remaining finite number
of terms of $r(x,\xi)$ have compact support in $\xi$, they too are in $S^{-n}_{1,0}(U)$ and then so
is $r(x,\xi)$. Since $n$ is arbitrary, $r(x,\xi)$ is in $S^{-\infty}_{1,0}(U)$. Hence the corresponding
pseudodifferential operator is a smoothing operator. Consequently we obtain, as
an identity on $C^\infty_{\mathrm{com}}(U)$ or on $\mathcal{E}'(U)$,

\[
P(x,D)G=1+R
\]

with $R$ a smoothing operator. Therefore $G$ is a right parametrix for $P(x,D)$.

From the existence of a right parametrix, it can be shown that $P(x,D)u=f$ is
locally solvable.⁹ If we could obtain a left parametrix, i.e., a pseudodifferential
operator $H$ with $HP(x,D)=1+S$ for a smoothing operator $S$, then it would
follow that singular supports satisfy

\[
\operatorname{sing\,supp}u=\operatorname{sing\,supp}f\qquad\text{whenever $f$ is in $\mathcal{E}'(U)$ and $P(x,D)u=f$.}
\]

Inclusion in one direction follows from the local nature of $P(x,D)$ in its action
on $u$: $\operatorname{sing\,supp}f=\operatorname{sing\,supp}P(x,D)u\subseteq\operatorname{sing\,supp}u$. Inclusion in the reverse
direction uses the "pseudolocal" property of any pseudodifferential operator and
of $H$ in particular, namely that $\operatorname{sing\,supp}Hf\subseteq\operatorname{sing\,supp}f$. It goes as follows:

\[
\operatorname{sing\,supp}u=\operatorname{sing\,supp}(1+S)u=\operatorname{sing\,supp}HP(x,D)u
=\operatorname{sing\,supp}Hf\subseteq\operatorname{sing\,supp}f.
\]

In particular, if $f$ is in $C^\infty_{\mathrm{com}}(U)$, then $u$ is in $C^\infty(U)$. Constructing a left
parametrix $H$ with the techniques discussed so far is, however, more difficult
than constructing the right parametrix $G$ because we cannot so readily determine
the symbol of $HP(x,D)$ for a general pseudodifferential operator $H$.

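Let us again work with the general theory, taking $g$ to be in $S^m_{1,0}(U)$ and
denoting the corresponding pseudodifferential operator $G:C^\infty_{\mathrm{com}}(U)\to C^\infty(U)$
by

\[
G\varphi(x)=\int_{\mathbb{R}^N}e^{2\pi i x\cdot\xi}g(x,\xi)\,\widehat{\varphi}(\xi)\,d\xi\qquad\text{for }\varphi\in C^\infty_{\mathrm{com}}(U).
\]

The distribution $T_{G\varphi}$, which we write more simply as $G\varphi$, acts on a function $\psi$
in $C^\infty_{\mathrm{com}}(U)$ by

\[
\begin{aligned}
\langle G\varphi,\psi\rangle&=\int_{\mathbb{R}^N}\int_U e^{2\pi i x\cdot\xi}g(x,\xi)\psi(x)\,\widehat{\varphi}(\xi)\,dx\,d\xi\\
&=\int_{\mathbb{R}^N}\int_U\int_U e^{2\pi i(x-y)\cdot\xi}g(x,\xi)\psi(x)\varphi(y)\,dy\,dx\,d\xi.
\end{aligned}
\]

If we think of $\psi(x)\varphi(y)$ as a particular kind of function $w(x,y)$ in $C^\infty_{\mathrm{com}}(U\times U)$,
then we can extend the above formula to define a linear functional $\mathcal{G}$ on all of
$C^\infty_{\mathrm{com}}(U\times U)$ by

\[
\langle\mathcal{G},w\rangle=\int_{\mathbb{R}^N}\Big[\int_{U\times U}e^{2\pi i(x-y)\cdot\xi}g(x,\xi)w(x,y)\,dx\,dy\Big]d\xi.
\]

It is readily verified that $\mathcal{G}$ is continuous on $C^\infty_{\mathrm{com}}(U\times U)$ and hence lies in
$\mathcal{D}'(U\times U)$. The expression written formally as

\[
\mathcal{G}(x,y)=\int_{\mathbb{R}^N}e^{2\pi i(x-y)\cdot\xi}g(x,\xi)\,d\xi
\]

is called the distribution kernel of the pseudodifferential operator $G$. This
expression is not to be regarded as a function but as a distribution that is evaluated
by the formula for $\langle\mathcal{G},w\rangle$ above.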
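
The simplest case, added here for orientation, is the symbol $g\equiv 1$, which is in $S^0_{1,0}(U)$ and whose operator is the identity by the Fourier inversion formula. Carrying out the $y$ integration first and then applying Fourier inversion in $\xi$ gives

```latex
\[
\langle\mathcal{G},w\rangle
=\int_{\mathbb{R}^N}\Big[\int_{U\times U}e^{2\pi i(x-y)\cdot\xi}\,w(x,y)\,dx\,dy\Big]d\xi
=\int_U w(x,x)\,dx ,
\]
```

so that the distribution kernel is formally $\delta(x-y)$. It is supported on the diagonal of $U\times U$ and is certainly not a function there, but it is (trivially) smooth off the diagonal.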

The first serious general fact in the theory is as follows. 


⁹More detail about this matter is included in Section VIII.8.

Theorem 7.19. If $G$ is a pseudodifferential operator on an open set $U$ in $\mathbb{R}^N$,
then the distribution kernel $\mathcal{G}$ of $G$ is a smooth function off the diagonal of $U\times U$,
and $G$ is pseudolocal in the sense that

\[
\operatorname{sing\,supp}Gf\subseteq\operatorname{sing\,supp}f\qquad\text{for all }f\in\mathcal{E}'(U).
\]

We give only a few comments about the proof, omitting all details. The first
conclusion of the theorem is proved by using the known decrease of the derivatives
of $g(x,\xi)$. For example, to see that $\mathcal{G}$ is given by a continuous function, one uses
the decrease of $D_\xi^\alpha g(x,\xi)$ in the $\xi$ variable to exhibit $(x-y)^\alpha\mathcal{G}$, for $|\alpha|>m+N$,
as equal to a multiple of the continuous function $\int_{\mathbb{R}^N}e^{2\pi i(x-y)\cdot\xi}D_\xi^\alpha g(x,\xi)\,d\xi$.
The second conclusion of the theorem, the pseudolocal property, can be derived
as a consequence by using an approximate-identity argument.

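To establish a general theory of pseudodifferential operators, the next step is
to come to grips with the composition of two pseudodifferential operators. If we
have two pseudodifferential operators $G$ and $H$ on the open set $U$, then each maps
$C^\infty_{\mathrm{com}}(U)$ into $C^\infty(U)$, and their composition $G\circ H$ need not be defined. But the
composition is sometimes defined, as in the case that $H$ is a differential operator
and in the case that $H$ is replaced by $\psi(x)H$, where $\psi$ is a fixed member of
$C^\infty_{\mathrm{com}}(U)$. Thus let us for the moment ignore this problem concerning the image
of $H$ and make a formal calculation of the symbol of the composition anyway.
Say that $G=G(x,D)$ and $H=H(x,D)$ are defined by the symbols $g(x,\xi)$
and $h(x,\xi)$. Substituting from the definition of $H(x,D)\varphi(x)$ and allowing any
interchanges of limits that present themselves, we have

\[
\begin{aligned}
G(x,D)H(x,D)\varphi(x)&=G(x,D)\int_{\mathbb{R}^N}e^{2\pi i x\cdot\xi}h(x,\xi)\,\widehat{\varphi}(\xi)\,d\xi\\
&=\int_{\mathbb{R}^N}G(x,D)\big[e^{2\pi i x\cdot\xi}h(x,\xi)\big]\widehat{\varphi}(\xi)\,d\xi\\
&=\int_{\mathbb{R}^N}e^{2\pi i x\cdot\xi}\Big(e^{-2\pi i x\cdot\xi}G(x,D)\big[e^{2\pi i x\cdot\xi}h(x,\xi)\big]\Big)\widehat{\varphi}(\xi)\,d\xi.
\end{aligned}
\]

This formula suggests that the composition $J=G\circ H$ ought to be a pseudodifferential
operator with symbol

\[
\begin{aligned}
j(x,\xi)&=e^{-2\pi i x\cdot\xi}G(x,D)\big[e^{2\pi i x\cdot\xi}h(x,\xi)\big]\\
&=e^{-2\pi i x\cdot\xi}\int_{\mathbb{R}^N}e^{2\pi i x\cdot\eta}g(x,\eta)\big[e^{2\pi i(\,\cdot\,)\cdot\xi}h(\,\cdot\,,\xi)\big]^{\wedge}(\eta)\,d\eta.
\end{aligned}
\]

Let us suppose that the Fourier transform of $h(x,\xi)$ in the first variable is meaningful,
as it is when $h(\,\cdot\,,\xi)$ has compact support. Write $\widehat{h}(\,\cdot\,,\xi)$ for this Fourier
transform. Then the above expression is equal to

\[
\int_{\mathbb{R}^N}e^{2\pi i x\cdot(\eta-\xi)}g(x,\eta)\,\widehat{h}(\eta-\xi,\xi)\,d\eta
=\int_{\mathbb{R}^N}e^{2\pi i x\cdot\eta}g(x,\eta+\xi)\,\widehat{h}(\eta,\xi)\,d\eta.
\]

If we form the infinite Taylor series expansion of $g(x,\eta+\xi)$ about $\eta=0$ and
assume that it converges, we have

\[
g(x,\eta+\xi)=\sum_{\alpha}\frac{1}{\alpha!}D_\xi^\alpha g(x,\xi)\,\eta^\alpha.
\]

Substituting and interchanging sum and integral, we can hope to get

\[
\begin{aligned}
j(x,\xi)&=\sum_{\alpha}\frac{1}{\alpha!}\int_{\mathbb{R}^N}e^{2\pi i x\cdot\eta}D_\xi^\alpha g(x,\xi)\,\eta^\alpha\,\widehat{h}(\eta,\xi)\,d\eta\\
&=\sum_{\alpha}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}D_\xi^\alpha g(x,\xi)\int_{\mathbb{R}^N}e^{2\pi i x\cdot\eta}(D_x^\alpha h)^{\wedge}(\eta,\xi)\,d\eta.
\end{aligned}
\]

In view of the Fourier inversion formula, we might therefore expect to obtain

\[
j(x,\xi)=\sum_{\alpha}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}D_\xi^\alpha g(x,\xi)\,D_x^\alpha h(x,\xi).
\]

We shall see that such a formula is meaningful, but in an asymptotic sense and
not as an equality.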
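
When both operators are differential operators, the sum is finite and the formula can be checked directly. As an illustration not taken from the text, work in one variable with $G=D$, whose symbol is $g(x,\xi)=2\pi i\xi$, and with $H$ equal to multiplication by a smooth function $a(x)$, whose symbol is $h(x,\xi)=a(x)$:

```latex
\[
(G\circ H)\varphi=D(a\varphi)=a\,D\varphi+a'\varphi
\quad\Longrightarrow\quad
j(x,\xi)=2\pi i\xi\,a(x)+a'(x),
\]
\[
\sum_{\alpha}\frac{(2\pi i)^{-\alpha}}{\alpha!}D_\xi^\alpha g\,D_x^\alpha h
=\underbrace{2\pi i\xi\,a(x)}_{\alpha=0}
+\underbrace{(2\pi i)^{-1}(2\pi i)\,a'(x)}_{\alpha=1}
+0 .
\]
```

The two expressions agree. The term with $\alpha=0$ is the product $gh$, and the term with $\alpha=1$ records the failure of $D$ to commute with multiplication by $a$.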
This discussion suggests four mathematical questions that we want to address: 


(i) If we are given a possibly divergent infinite series of symbols as on the
right side of the formula for $j(x,\xi)$ above, how can we extract a genuine
symbol to represent the sum of the series?

(ii) Put $G(x,D_x+\xi)\varphi(x)=\int_{\mathbb{R}^N}e^{2\pi i x\cdot\eta}g(x,\eta+\xi)\,\widehat{\varphi}(\eta)\,d\eta$. In what sense
of $\sim$ is it true that $G(x,D_x+\xi)\varphi(x)\sim\sum_{\alpha}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}D_\xi^\alpha g(x,\xi)\,D_x^\alpha\varphi(x)$?

(iii) How can we handle the matter of compact support?

(iv) How can we show, under suitable hypotheses that take (iii) into account,
that $j(x,\xi)$ is given by $G(x,D_x+\xi)(h(x,\xi))$ and therefore that we
obtain a formula from (ii) for $j(x,\xi)$ involving $\sim$?


The path that we shall follow is direct but not optimal. In Section VIII.6 we shall 
take note of an approach that is tidier and faster, but insufficiently motivated by 
the present considerations. 

Question (i) is fully addressed by the following theorem. 


Theorem 7.20. Suppose that $\{m_j\}_{j\ge 0}$ is a sequence in $\mathbb{R}$ decreasing to $-\infty$,
and suppose for $j\ge 0$ that $g_j(x,\xi)$ is a symbol in $S^{m_j}_{1,0}(U)$. Then there exists a
symbol $g(x,\xi)$ in $S^{m_0}_{1,0}(U)$ such that for all $n\ge 0$,

\[
g(x,\xi)-\sum_{j=0}^{n-1}g_j(x,\xi)\quad\text{is in }S^{m_n}_{1,0}(U).
\]

The theorem is proved in the same way that we constructed a right parametrix
for an elliptic differential operator earlier in this section. We can now give a
precise meaning to $\sim$ in terms of a notion of an asymptotic series. If $\{m_j\}_{j\ge 0}$
is a sequence in $\mathbb{R}$ decreasing to $-\infty$, if $g(x,\xi)$ is a symbol in $S^{m_0}_{1,0}(U)$, and if
$g_j(x,\xi)$ is a symbol in $S^{m_j}_{1,0}(U)$ for each $j\ge 0$, then we write

\[
g(x,\xi)\sim\sum_{j=0}^{\infty}g_j(x,\xi)
\]

if for all $n\ge 0$,

\[
g(x,\xi)-\sum_{j=0}^{n-1}g_j(x,\xi)\quad\text{is in }S^{m_n}_{1,0}(U).
\]

If the given sequence $\{m_j\}_{j\ge 0}$ is a finite sequence ending with $m_r$, we can
extend it to an infinite sequence with $g_j(x,\xi)=0$ for $j>r$, and in this case the
definition of $\sim$ is to be interpreted to mean that $g(x,\xi)-\sum_{j=0}^{r}g_j(x,\xi)$ is the
symbol of a smoothing operator.
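
A small example may help fix the definition. It is an illustration under stated assumptions and not an example from the text: take $g(x,\xi)=(1+|\xi|^2)^{-1}$, let $\chi$ be a smooth function of $\xi$ that vanishes near $\xi=0$ and equals 1 for $|\xi|\ge 1$, and put $g_j(x,\xi)=(-1)^j\chi(\xi)|\xi|^{-2j-2}$ with $m_j=-2j-2$. Summing the finite geometric series gives, for $|\xi|\ge 1$,

```latex
\[
\frac{1}{1+|\xi|^2}-\sum_{j=0}^{n-1}(-1)^j|\xi|^{-2j-2}
=\frac{(-1)^n\,|\xi|^{-2n}}{1+|\xi|^2},
\]
```

which decays like $|\xi|^{-2n-2}=|\xi|^{m_n}$, with each $\xi$ derivative gaining one further power of decay. Since the difference is smooth for $|\xi|\le 1$, this is the behavior required for $g\sim\sum_jg_j$. In this example the series happens to converge for $|\xi|>1$, but the definition of $\sim$ does not require convergence.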

For (ii), we have just attached a meaning to $\sim$. We define $G(x,D_x+\xi)\varphi(x)=
\int_{\mathbb{R}^N}e^{2\pi i x\cdot\eta}g(x,\eta+\xi)\,\widehat{\varphi}(\eta)\,d\eta$. The precise statement that is proved to yield the
asymptotic expansion of (ii) is the following.


Proposition 7.21. Let $U$ be open in $\mathbb{R}^N$, fix $g$ in $S^m_{1,0}(U)$, and let $K$ be a
compact subset of $U$. Then for any nonnegative integers $M$ and $R$ such that
$R>m+N$, there exists a constant $C$ such that


|G, Dy +) 900) — Ligien PI DE g(x, E) D200] 


+ Viaten El" * sup, [[D° eI + léllx — yD ]} 


for all g in C®, all x in K, and all € with |&| > 1. 


We shall not make further explicit use of this proposition. The proof of the 
result is long, and we omit any discussion of it. 

We turn to questions (iii) and (iv). Question (iii) is addressed by a definition 
and some remarks concerning it, and question (iv) is addressed by the theorem 
that comes after those remarks. Continuing with our pseudodifferential operator
$G$ on the open set $U$, we say that $G$ is properly supported if the subset $\operatorname{support}(\mathcal{G})$
of $U\times U$ has compact intersection with $K\times U$ and with $U\times K$ for every compact
subset $K$ of $U$. See Figure 7.2.

FIGURE 7.2. Nature of the support of the distribution kernel
of a properly supported pseudodifferential operator. The
open set $U$ in this case is an open interval, and the oval-shaped
region represents $\operatorname{support}(\mathcal{G})$. The shaded region
is an example of a set $(U\times K)\cap\operatorname{support}(\mathcal{G})$.


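Suppose that $G$ is properly supported, $K$ is compact in $U$, and $\varphi$ is in $C^\infty_{\mathrm{com}}(U)$
with support contained in $K$. Introduce projections $p_1(x,y)=x$ and $p_2(x,y)=
y$. Define $L=p_1\big((U\times K)\cap\operatorname{support}(\mathcal{G})\big)$; the set $L$ is compact since $G$ is
properly supported and since the continuous image of a compact set is compact.
Let us see that $G\varphi$ has support contained in $L$. To do so, we write $\psi\otimes\varphi$ for the
function $(x,y)\mapsto\psi(x)\varphi(y)$, and then we have

\[
\langle G\varphi,\psi\rangle=\int_{\mathbb{R}^N}\int_U\int_U e^{2\pi i(x-y)\cdot\xi}g(x,\xi)\psi(x)\varphi(y)\,dy\,dx\,d\xi=\langle\mathcal{G},\psi\otimes\varphi\rangle.
\]

If $\psi$ is in $C^\infty_{\mathrm{com}}(L^c\cap U)$, then $F=p_1^{-1}(\operatorname{support}\psi)\cap p_2^{-1}(\operatorname{support}\varphi)$ is the
compact support of $\psi\otimes\varphi$, and

\[
F\cap\operatorname{support}(\mathcal{G})\subseteq p_1^{-1}(L^c)\cap(U\times K)\cap\operatorname{support}(\mathcal{G})\subseteq p_1^{-1}(L^c)\cap p_1^{-1}(L)=\varnothing.
\]

Thus $\langle\mathcal{G},\psi\otimes\varphi\rangle=0$, $\langle G\varphi,\psi\rangle=0$, and $G\varphi$ is supported in $L$.

Thus the properly supported pseudodifferential operator $G$ carries $C^\infty_{\mathrm{com}}(U)$
into itself, and Lemma 5.2 shows that it does so continuously. Then $G$ is
continuous also as a mapping of the dense vector subspace $C^\infty_{\mathrm{com}}(U)$ of $C^\infty(U)$
into $C^\infty(U)$. Because of the completeness of $C^\infty(U)$, $G$ extends to a continuous
map of $C^\infty(U)$ into itself.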

Similarly one checks that any properly supported pseudodifferential operator
carries $\mathcal{E}'(U)$ into $\mathcal{E}'(U)$. Therefore the composition $G\circ H$ of two pseudodifferential
operators, whether regarded as acting on $C^\infty_{\mathrm{com}}(U)$ or as acting on $\mathcal{E}'(U)$,
is well defined if $H$ is properly supported.

Theorem 7.22. Let $U$ be an open subset of $\mathbb{R}^N$.

(a) If $G$ is a pseudodifferential operator on $U$, then there exists a properly
supported pseudodifferential operator $G^\#$ on $U$ such that the symbol of $G-G^\#$ is in $S^{-\infty}_{1,0}(U)$,
hence such that $G-G^\#$ is a smoothing operator.

(b) If $G$ and $H$ are properly supported pseudodifferential operators on $U$ with
symbols $g$ in $S^m_{1,0}(U)$ and $h$ in $S^{m'}_{1,0}(U)$, then $G\circ H$ is a properly supported
pseudodifferential operator with symbol $j$ in $S^{m+m'}_{1,0}(U)$, and

\[
j(x,\xi)\sim\sum_{\alpha}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}D_\xi^\alpha g(x,\xi)\,D_x^\alpha h(x,\xi).
\]

All that is needed from (b) in many cases is the following weaker statement.


Corollary 7.23. Let $U$ be an open subset of $\mathbb{R}^N$. If $G$ and $H$ are properly
supported pseudodifferential operators on $U$ with symbols $g$ in $S^m_{1,0}(U)$ and $h$ in
$S^{m'}_{1,0}(U)$, then $G\circ H$ is a properly supported pseudodifferential operator whose
symbol $j(x,\xi)$ is in $S^{m+m'}_{1,0}(U)$ and has the property that

\[
j(x,\xi)-g(x,\xi)h(x,\xi)
\]

is a symbol in $S^{m+m'-1}_{1,0}(U)$.
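
For differential operators with polynomial coefficients in one variable, the expansion in Theorem 7.22b is a finite sum and can be evaluated mechanically. The sketch below is an illustration under assumptions of our own: a symbol is stored as a dictionary of coefficients of $x^i\zeta^k$, where $\zeta$ abbreviates $2\pi i\xi$, so that $(2\pi i)^{-\alpha}D_\xi^\alpha$ becomes $\partial_\zeta^\alpha$ and the formula reads $j=\sum_\alpha\frac{1}{\alpha!}\,\partial_\zeta^\alpha g\,\partial_x^\alpha h$. The helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

# A symbol is a dict {(i, k): c} standing for the sum of c * x**i * zeta**k,
# where zeta abbreviates 2*pi*i*xi, so that zeta**k is the symbol of D**k.


def d_var(sym, which):
    """Differentiate once with respect to x (which=0) or zeta (which=1)."""
    out = {}
    for (i, k), c in sym.items():
        e = (i, k)[which]
        if e > 0:
            key = (i - 1, k) if which == 0 else (i, k - 1)
            out[key] = out.get(key, 0) + c * e
    return out


def mul(a, b):
    """Pointwise product of two polynomial symbols."""
    out = {}
    for (i1, k1), c1 in a.items():
        for (i2, k2), c2 in b.items():
            key = (i1 + i2, k1 + k2)
            out[key] = out.get(key, 0) + c1 * c2
    return out


def compose(g, h):
    """Symbol of G o H via j = sum over alpha of (1/alpha!) d_zeta^alpha g * d_x^alpha h."""
    total, dg, dh, alpha = {}, dict(g), dict(h), 0
    while dg and dh:  # derivatives of polynomials are eventually zero
        for key, c in mul(dg, dh).items():
            total[key] = total.get(key, 0) + Fraction(c) / factorial(alpha)
        dg, dh, alpha = d_var(dg, 1), d_var(dh, 0), alpha + 1
    return {key: c for key, c in total.items() if c != 0}


def order(sym):
    """Highest power of zeta present, i.e. the order as a differential operator."""
    return max(k for (_, k) in sym)


if __name__ == "__main__":
    g = {(1, 1): 1}             # G = x D
    h = {(0, 2): 1, (1, 0): 1}  # H = D^2 + x
    print(compose(g, h))        # symbol of x D^3 + x^2 D + x
```

In the usage example, $xD(D^2+x)u=xu'''+x^2u'+xu$ can be confirmed by hand, and the difference $j-gh$ is the single term $x$, whose order 0 is at most $m+m'-1=2$, consistent with Corollary 7.23.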


This is enough of the general theory so that we can see how to prove a theorem
with consequences beyond the subject of pseudodifferential operators. A
pseudodifferential operator $G$ on $U$ with symbol $g(x,\xi)$ in $S^m_{1,0}(U)$ is said to be
elliptic of order $m$ if for each compact subset $K$ of $U$, there are constants $C_K$
and $M_K$ such that

\[
|g(x,\xi)|\ge C_K(1+|\xi|)^m\qquad\text{for }x\in K\text{ and }|\xi|\ge M_K.
\]

In particular, an elliptic differential operator of order $m$ satisfies this condition. A
(two-sided) parametrix $H$ for a properly supported pseudodifferential operator
$G$ with symbol $g\in S^m_{1,0}(U)$ is a properly supported pseudodifferential operator
$H$ of order $-m$ such that $H\circ G=1+\text{smoothing}$ and $G\circ H=1+\text{smoothing}$.
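
For the Laplacian, whose symbol is $-4\pi^2|\xi|^2$, the ellipticity estimate can be verified with explicit constants. The constants below are one admissible choice made for illustration and are not taken from the text:

```latex
\[
|g(x,\xi)|=4\pi^2|\xi|^2\ \ge\ \pi^2(1+|\xi|)^2
\qquad\text{for }|\xi|\ge 1,
\]
```

because $2|\xi|\ge 1+|\xi|$ when $|\xi|\ge 1$. Thus $m=2$, $C_K=\pi^2$, and $M_K=1$ work for every compact set $K$. The restriction to large $|\xi|$ is essential, since the symbol vanishes at $\xi=0$.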


Theorem 7.24. If G is a properly supported elliptic pseudodifferential operator 
of order m, then G has a parametrix H. 


REMARKS. We saw in Theorem 7.19 that $\operatorname{sing\,supp}Gf\subseteq\operatorname{sing\,supp}f$ for $f$ in
$\mathcal{E}'(U)$. The same argument as with the left parametrix before that theorem shows
now from the parametrix of Theorem 7.24 that $\operatorname{sing\,supp}Gf\supseteq\operatorname{sing\,supp}f$ and
therefore that $\operatorname{sing\,supp}Gf=\operatorname{sing\,supp}f$ for $f$ in $\mathcal{E}'(U)$. In particular, solutions
of elliptic equations are smooth wherever the given data are smooth.

PARTIAL PROOF. Let $\rho:U\times\mathbb{R}^N\to[0,1]$ be a smooth function with the
properties that
(i) $\rho$ equals 1 in a neighborhood of each point $(x,\xi)$ where $g(x,\xi)=0$,
(ii) for each compact subset $K$ of $U$, there is a constant $T_K$ such that $\rho(x,\xi)=
0$ for $x$ in $K$ and $|\xi|\ge T_K$.
We omit the verification that $\rho$ exists and is the symbol of a smoothing operator.
Put

\[
h_0(x,\xi)=(1-\rho(x,\xi))\,g(x,\xi)^{-1}.
\]

This is a smooth function by (i), and we omit the step of checking that $h_0$ is
in $S^{-m}_{1,0}(U)$. Let $H_0$ be the pseudodifferential operator with symbol $h_0$. Apply
Theorem 7.22a to find a properly supported $H_0^\#$ whose symbol $h_0^\#$ has $h_0^\#\sim h_0$.
We write $h_0^\#=h_0+r_0$ with $r_0$ in $S^{-\infty}_{1,0}(U)$.

Corollary 7.23 shows that $H_0^\#G$ is a well-defined properly supported operator
whose symbol $j_0(x,\xi)$ is in $S^0_{1,0}(U)$ and has the property that $j_0-h_0^\#g$ is in
$S^{-1}_{1,0}(U)$. Since

\[
j_0-h_0^\#g=j_0-(h_0+r_0)g=j_0-\big[(1-\rho)g^{-1}+r_0\big]g=j_0-1+\rho-r_0g
\]

and since $\rho$ and $r_0g$ are the symbols of smoothing operators, $j_0-1$ must be in
$S^{-1}_{1,0}(U)$. Therefore $H_0^\#G=1+R$ for a pseudodifferential operator $R$ whose
symbol $r$ is in $S^{-1}_{1,0}(U)$.

The equality $H_0^\#G=1+R$ shows that $R$ is properly supported. By Corollary
7.23, $R^k$ is a properly supported pseudodifferential operator for all integers $k\ge 1$,
and its symbol $r_k$ is in $S^{-k}_{1,0}(U)$. We form the asymptotic series

\[
1-r_1+r_2-r_3+\cdots
\]

and use Theorems 7.20 and 7.22a to obtain a properly supported pseudodifferential
operator $E$ whose symbol $e$ is in $S^0_{1,0}(U)$ and has

\[
e\sim 1-r_1+r_2-r_3+\cdots.\tag{$*$}
\]

For any integer $n\ge 1$, we have

\[
\begin{aligned}
(1-R+R^2-R^3+\cdots\pm R^{n-1})H_0^\#G
&=(1-R+R^2-R^3+\cdots\pm R^{n-1})(1+R)=1\pm R^n.
\end{aligned}\tag{$**$}
\]


Because of $(*)$, $E-(1-R+R^2-R^3+\cdots\pm R^{n-1})$ has symbol in $S^{-n}_{1,0}(U)$.
Since the symbol $j_0$ of $H_0^\#G$ is in $S^0_{1,0}(U)$, the product

\[
\big(E-(1-R+R^2-R^3+\cdots\pm R^{n-1})\big)H_0^\#G\quad\text{has symbol in }S^{-n}_{1,0}(U).
\]

Also, $(**)$ implies that

\[
(1-R+R^2-R^3+\cdots\pm R^{n-1})H_0^\#G-1=\pm R^n\quad\text{has symbol in }S^{-n}_{1,0}(U).
\]

Adding shows that

\[
EH_0^\#G-1\quad\text{has symbol in }S^{-n}_{1,0}(U).
\]

Since $n$ is arbitrary, $EH_0^\#G-1$ is a smoothing operator. Thus $H=EH_0^\#$ is a
left parametrix for $G$.
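
The algebra behind $(**)$ is the finite geometric series for $-R$, and the ambiguous sign can be made explicit. The following identity is an elementary check added for the reader and uses only the fact that powers of $R$ commute with one another:

```latex
\[
\Big(\sum_{k=0}^{n-1}(-R)^k\Big)(1+R)=1-(-R)^n=1+(-1)^{n-1}R^n,
\qquad\text{e.g.}\qquad
(1-R+R^2)(1+R)=1+R^3 .
\]
```

The sign of the remainder is immaterial for the argument, since only the order $-n$ of the symbol of $R^n$ is used.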

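In similar fashion we can use the assumption "properly supported" to obtain a
right parametrix $\widetilde{H}$ for $G$. We omit the details. The operators $H$ and $\widetilde{H}$ give us
equations

\[
HG=1+S\qquad\text{and}\qquad G\widetilde{H}=1+\widetilde{S}
\]

for suitable properly supported smoothing operators $S$ and $\widetilde{S}$. Computing the
product $HG\widetilde{H}$ in two ways shows that

\[
HG\widetilde{H}=(1+S)\widetilde{H}=\widetilde{H}+S\widetilde{H}=\widetilde{H}+\text{smoothing}
\]

and

\[
HG\widetilde{H}=H(1+\widetilde{S})=H+H\widetilde{S}=H+\text{smoothing}.
\]

Hence $H=\widetilde{H}+S_0$ with $S_0$ properly supported smoothing. Consequently

\[
GH=G\widetilde{H}+GS_0=1+\widetilde{S}+GS_0=1+\text{smoothing},
\]

and the left parametrix $H$ is also a right parametrix.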


BIBLIOGRAPHICAL REMARKS. The proof of Theorem 7.7 is adapted from 
Taylor’s Pseudodifferential Operators, and the proof of Theorem 7.12 is taken 
from the book by Bers, John, and Schechter. The approach to pseudodifferential 
operators used in Section 6 is now considered outdated, and a more streamlined
approach requiring additional motivation appears in Section VIII.6.


7. Problems 


1. Suppose that $P(x,D)=\sum_{|\alpha|\le m}a_\alpha(x)D^\alpha$ with each $a_\alpha$ in $C^\infty(\Omega)$. Prove that
if P(x, D)u = 0 for all functions u € C’” ({2), then all the coefficients a, are 0. 


2. (Harmonic measure) Let $\Omega$ be a bounded nonempty connected open subset
of $\mathbb{R}^N$, let $\partial\Omega$ be its boundary $\partial\Omega=\Omega^{\mathrm{cl}}-\Omega$, and let $L$ be an elliptic linear
differential operator on $\Omega$ of the form $L(u)=\sum_{i,j}b_{ij}(x)D_iD_ju+\sum_kc_k(x)D_ku$
with real-valued coefficients of class $C^\infty$ such that $b_{ij}(x)=b_{ji}(x)$ for all $i$ and $j$.
Let $S$ be the vector subspace of real-valued continuous functions $u$ on $\Omega^{\mathrm{cl}}$ such
that $Lu(x)=0$ for all $x\in\Omega$. Prove for each point $p$ in $\Omega$ that there exists a
Borel measure $\mu_p$ on $\partial\Omega$ with $\mu_p(\partial\Omega)=1$ such that $u(p)=\int_{\partial\Omega}u(x)\,d\mu_p(x)$
for all $u$ in $S$.

3. This problem identifies a fundamental solution of the Cauchy–Riemann operator
in $\mathbb{R}^2$. It makes use of Green's Theorem, which relates line integrals in $\mathbb{R}^2$ with
double integrals, for an annulus centered at the origin.

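(a) For $\varphi$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^2)$, let $P(x,y)=\dfrac{x\varphi}{x^2+y^2}$ and $Q(x,y)=\dfrac{y\varphi}{x^2+y^2}$. Prove that
$\lim_{\varepsilon\downarrow 0}\oint_{x^2+y^2=\varepsilon^2}(P\,dx+Q\,dy)=0$.

(b) With $P$ and $Q$ as in (a), verify that $\dfrac{\partial Q}{\partial x}-\dfrac{\partial P}{\partial y}=\dfrac{y\varphi_x-x\varphi_y}{x^2+y^2}$.

(c) Conclude from (a) and (b) that $\iint_{\mathbb{R}^2}\dfrac{y\varphi_x-x\varphi_y}{x^2+y^2}\,dx\,dy=0$.

(d) Repeat (a) with $P(x,y)=-\dfrac{y\varphi}{x^2+y^2}$ and $Q(x,y)=\dfrac{x\varphi}{x^2+y^2}$, showing that
$\lim_{\varepsilon\downarrow 0}\oint_{x^2+y^2=\varepsilon^2}(P\,dx+Q\,dy)=2\pi\varphi(0,0)$ if the line integral is taken
counterclockwise around the circle.

(e) With $P$ and $Q$ as in (d), verify that $\dfrac{\partial Q}{\partial x}-\dfrac{\partial P}{\partial y}=\dfrac{x\varphi_x+y\varphi_y}{x^2+y^2}$.

(f) Conclude from (d) and (e) that $\iint_{\mathbb{R}^2}\dfrac{x\varphi_x+y\varphi_y}{x^2+y^2}\,dx\,dy=-2\pi\varphi(0,0)$.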


(g) Conclude from (c) and (f) that + ff. + dx dy = —g(, 0). 

(h) Let T be the locally integrable function 1 i (27z), regarded as a member of 
D’(R’). Prove that 4(T) = 6. 

4. On $\mathbb{R}^1$, the Heaviside distribution $H$ is the distribution given by the Heaviside
function $H(x)$ equal to 1 for $x>0$ and to 0 for $x<0$.

(a) Prove that $D_xH=\delta$, so that $H$ is a fundamental solution for the elliptic
operator $D_x$ on $\mathbb{R}^1$.

(b) Show that the function $f(x)=\max\{x,0\}$ on $\Omega=(-1,1)$ has the Heaviside
function as weak derivative on $\Omega$ and that $f$ is in $L^p_1(\Omega)$ for every $p$ with
$1\le p<\infty$.

(c) Does the restriction of the Heaviside function to $\Omega=(-1,1)$ have a weak
derivative on $\Omega$? Why or why not?

(d) Show that the distribution $H\times\delta$ on $\mathbb{R}^2$ given by $\langle H\times\delta,\varphi\rangle=\int_0^\infty\varphi(x,0)\,dx$
for $\varphi\in C^\infty_{\mathrm{com}}(\mathbb{R}^2)$ is a fundamental solution of the operator $D_x$ on $\mathbb{R}^2$.

(e) Find the support and the singular support of the distribution $H$ on $\mathbb{R}^1$ and of
the distribution $H\times\delta$ on $\mathbb{R}^2$.

5. Let $U$ be an open set in $\mathbb{R}^N$ containing 0, let $f$ be in $\mathcal{E}'(U)$, and let $P(D)$ be
a linear differential operator with constant coefficients and with order $\ge 1$. By
taking into account the theory of periodic distributions in Problems 12-13 of
Chapter V and by suitably adapting the proof that Lemma 7.8 implies Theorem
7.7, prove that the equation $P(D)u=f$ has a distribution solution in some
neighborhood of 0.

Problems 6-9 prove the global version of the Cauchy–Kovalevskaya Theorem given
as Theorem 7.2 for the linear constant-coefficient case. The result is an ingredient
used in deriving Corollary 7.15 from Theorem 7.14. For the statement the domain
variables are $t$ and $x$ with $x=(x_1,\dots,x_N)$, and the unknown functions are the $p$
components of a function $u(t,x)$ with values in $\mathbb{C}^p$. Write $D_t$ for $\partial/\partial t$ and $D_j$ for
$\partial/\partial x_j$. The Cauchy problem in question is

\[
\begin{aligned}
D_tu&=\sum_{j=1}^{N}A_jD_ju+Bu+F(t,x),\\
u(0,x)&=g(x),
\end{aligned}
\]

where $A_j$ and $B$ are $p$-by-$p$ matrices of complex constants, $F$ is an entire holomorphic
function from $\mathbb{C}^{N+1}$ to $\mathbb{C}^p$, and $g$ is an entire holomorphic function from $\mathbb{C}^N$ to $\mathbb{C}^p$.
The conclusion is that the unique formal power-series solution of the Cauchy problem
converges and defines an entire holomorphic function from $\mathbb{C}^{N+1}$ to $\mathbb{C}^p$ that solves
the problem. For a vector $v=(v_1,\dots,v_p)$ in $\mathbb{C}^p$, let $\|v\|_\infty=\max\{|v_1|,\dots,|v_p|\}$.


6. 


Let a denote a multi-index a = (a,...,a@y) of integers > 0. Prove that 


[oe] 
ot! < (lo|)!, that 7), 4 = Se, and that 3° (9F")2! = (1 — z)-41 if |z| < 1. 
: ; 1=0 


Show that iterated substitution into the system D,;u = pe AjDju+ Bu+F 
leads to an expression for D/"u as the sum of two kinds of terms: For one kind, 
there are 2” terms of the form )> 7, --- Tj, D%u with each T; equal to an A j, OF 
to B, with D®% equal to the product of the Dj, for which T; = Aj,, and with the 
sum taken over j; from | to NV. For the other kind, there are ar 2S 2" = 4 
terms with something operating on F’,, the terms corresponding to s being the 
ones )° 7, --- 7; Di pets F with each 7;, the D®, and the sum all as above. 


(a) How does one compute pé Di"u(O, 0) from the expression in the previous 
problem? 
(b) Why is it enough to prove, for any givenr > 0, that the values Dé Di"u(0, 0) 
satisfy )° >> (B!m!)~!||D’ D™u(0, 0) ||, r4'*™ <o? 
m>0 B 


Choose a constant M > | with ||Bul|,, < M]lvul|,, and ||Ajul|,, < M|lvll,, for 

all 7. Let R be a positive number to be specified. Choose C = C(R) such 

that > > (B!m!)~!||D" DEF 0,0) || .R'I™ and > (B!)~!|DEgO)|| oR!" 
m>0 B B 

are both < C. 

(a) Among the 2” terms of the first kind in Problem 7, show that each one for 
which k of the m factors T;,..., Tm are Bis < M™N"-*CR-"*) (m—k)!, 
so that the sum of the contributions from the terms of the first kind to 
| Du (0, O)||5 is < P79 (2) M"N"-*CR-™S (im —k)!. 

(b) Taking into account the result of Problem 8a, adjust the estimate in part (a) 
of the present problem to bound the sum of the contributions from the terms 
of the first kind to || D?” Deu, O)Ileo- 

(c) Summing over $m\ge 0$, $l\ge 0$, and $\beta$ with $|\beta|=l$ the estimate in part (b) and
using the formulas in Problem 6, show that the contribution of the terms of
the first kind to the series in Problem 8b is finite if $R$ is chosen large enough
so that $Nr/R<\tfrac12$ and $2MrN/R<1$.

(d) For the $2^m-1$ terms of the second kind in Problem 7, replace $T_1\cdots T_s$ by
$T_1\cdots T_{m-1}$, treating the missing factors as the identity $I$, each such factor
accompanying a differentiation $D_t$. If there are $k$ factors of $B$, show that
the term is $\le M^{m-1}(N+1)^{m-1-k}CR^{-(m-1-k)}(m-1-k)!$. Arguing in a
fashion similar to the previous parts to this problem, show that consequently
the contribution of the terms of the second kind to the series in Problem 8b is
finite if $R$ is chosen large enough so that $Nr/R<\tfrac12$ and $2Mr(N+1)/R<1$.


Problems 10-12 concern the reduction to a first-order system of the Cauchy problem
for a single $m^{\mathrm{th}}$-order partial differential equation that has been solved for $D_x^mu$. They
generalize the discussion of a second-order equation in two variables that appeared
in Section 1 and reduce Theorems 7.3 and 7.4 to Theorems 7.1 and 7.2, respectively.
In two variables $(x,y)$, the equation is


Diu = F(x, y; u; Dyu, Dyu; Du, ee D"™"'Dyu, re Dy u), 


and the Cauchy data are 


10. 


11. 


12. 


Diu, y)= f(y) for0 <i <m. 


In the case of two variables (x, y), introduce variables u’/ for i+ j < m. Show 
that the given Cauchy problem is equivalent to the following Cauchy problem 
for a first-order system 


Dub }*! = Dyul forit+j+l<m, 
D,ub® = yit10 for 0 <i <™m, 


Dw" = F, +0) Foo tu Fi0+(Dyu!) Fo. +.- -+(Dyul"—!) Fon 
with Cauchy data 


(0, y) = DIF) for i+ j<m, Gj) 4 (m,0), 
WO 9) =F O93 FOOT FOO) Def © Dseuie D2 f OO). 


What changes to the setup and argument in Problem 10 are needed to handle 
more variables, say (x, yj,..-, Yw_1)? 


Back in the situation of two variables (x, y) as in Problem 10, suppose that F 
is a linear combination, with constant coefficients, of wu, Dyu, Dyu,... , Diu, 
plus an entire holomorphic function of (x, y), and suppose that f©,... , f"~) 
are entire holomorphic functions of y. Prove that the reduction to first order as 
in Problem 10 leads to a Cauchy problem for a first-order system of the type in 
Problems 6-9. Conclude that the Cauchy problem for the given m'-order equa- 
tion in the situation of constant coefficients has an entire holomorphic solution. 


CHAPTER VIII 


Analysis on Manifolds 


Abstract. This chapter explains how the theory of pseudodifferential operators extends from open 
subsets of Euclidean space to smooth manifolds, and it gives examples to illustrate the usefulness of 
generalizing the theory in this way. 

Section 1 gives a brief introduction to differential calculus on smooth manifolds. The section 
defines smooth manifolds, smooth functions on them, tangent spaces to smooth manifolds, and 
differentials of smooth mappings between smooth manifolds, and it proves a version of the Inverse 
Function Theorem for manifolds. 

Section 2 extends the theory of smooth vector fields and integral curves from open subsets of 
Euclidean space to smooth manifolds. 

Section 3 develops a special kind of quotient space, called an “identification space,” suitable 
for constructing general smooth manifolds, vector bundles and fiber bundles, and covering spaces 
out of local data. In particular, smooth manifolds may be defined as identification spaces without 
knowledge of the global nature of the underlying topological space; the only problem is in addressing 
the Hausdorff property. 

Section 4 introduces vector bundles, including the tangent and cotangent bundles to a manifold. 
A vector bundle determines transition functions, and in turn the transition functions determine the 
vector bundle via the construction of the previous section. The manifold structures on the tangent 
and cotangent bundles are constructed in this way. 

Sections 5-8 concern pseudodifferential operators, including aspects of the theory useful in 
solving problems in other areas of mathematics. The emphasis is on operators on scalar-valued 
functions. Section 5 introduces spaces of smooth functions and their topologies, and it defines 
spaces of distributions; the theory has to compensate for the lack of a canonical underlying measure 
on the manifold, hence for the lack of a canonical way to view a smooth function as a distribution. 
Section 5 goes on to study linear partial differential equations on the manifold; although the symbol of 
the differential operator is not meaningful, the principal symbol is intrinsically defined as a function 
on the cotangent bundle. The introduction of pseudodifferential operators on smooth manifolds 
requires new results for the theory in Euclidean space beyond what is in Chapter VII. Section 6 
addresses this matter. A notion of transpose is needed, and it is necessary to understand the effect of 
diffeomorphisms on Euclidean pseudodifferential operators. To handle these questions, it is useful 
to enlarge the definition of pseudodifferential operator for Euclidean space and to redo the Euclidean 
theory from the new point of view. Once that program has been carried out, Section 7 patches 
together pseudodifferential operators in Euclidean space to obtain pseudodifferential operators on 
smooth separable manifolds. The notions of pseudolocal, properly supported, composition, and 
elliptic extend, and the theorems are what one might expect from the Euclidean theory. Again the 
principal symbol is well defined as a function on the cotangent bundle. Section 8 contains remarks 
about extending the theory to handle operators carrying sections of one vector bundle to sections of 
another vector bundle, about some other continuations of the theory, and about applications outside 
real analysis. The section concludes with some bibliographical material. 

The goal of this chapter is to explain how aspects of the subject of linear partial 
differential equations extend from open subsets of Euclidean space to smooth 
manifolds. After an introduction to manifolds and their differential calculus, 
we shall see the extent to which definitions and theorems about distributions, 
differential operators, and pseudodifferential operators carry over from local facts 
about Euclidean space to global facts about smooth manifolds. We shall see 
also how certain important systems of differential equations can conveniently be 
expressed globally in terms of operators from one vector bundle to another. 

The present section introduces smooth manifolds, smooth functions on them, 
tangent spaces to smooth manifolds, differentials of smooth mappings between 
smooth manifolds, and a version of the Inverse Function Theorem for manifolds. 

We begin with the definition of smooth manifold. Let $M$ be a Hausdorff
topological space, and fix an integer $n \ge 0$. A chart on $M$ of dimension $n$ is a
homeomorphism $\kappa : M_\kappa \to \widetilde{M}_\kappa$ of an open subset $M_\kappa$ of $M$ onto an open subset
$\widetilde{M}_\kappa$ of $\mathbb{R}^n$; the chart $\kappa$ is said to be about a point $p$ in $M$ if $p$ is in the domain
$M_\kappa$ of $\kappa$. We say that $M$ is a manifold if there is an integer $n \ge 0$ such that each
point of $M$ has a chart of dimension $n$ about it.

A smooth structure of dimension $n$ on a manifold $M$ is a family $\mathcal{F}$ of
$n$-dimensional charts with the following three properties:

(i) any two charts $\kappa$ and $\kappa'$ in $\mathcal{F}$ are smoothly compatible in the sense that
$\kappa' \circ \kappa^{-1}$, as a mapping of the open subset $\kappa(M_\kappa \cap M_{\kappa'})$ of $\mathbb{R}^n$ to the open
subset $\kappa'(M_\kappa \cap M_{\kappa'})$ of $\mathbb{R}^n$, is smooth and has a smooth inverse,

(ii) the system of compatible charts $\kappa$ is an atlas in the sense that the domains
$M_\kappa$ together cover $M$,

(iii) $\mathcal{F}$ is maximal among families of compatible charts on $M$.


A smooth manifold of dimension n is a manifold together with a smooth structure 
of dimension n. In the presence of an understood atlas, a chart will be said to be 
compatible if it is compatible with all the members of the atlas. 

Once we have an atlas of compatible n-dimensional charts for a manifold M, 
i.e., once (i) and (ii) are satisfied, then the family of all compatible charts satisfies 
(i) and (iii), as well as (ii), and therefore is a smooth structure. In other words, an 
atlas determines one and only one smooth structure. Thus, as a practical matter, 
we can construct a smooth structure for a manifold by finding an atlas satisfying 
(i) and (ii), and the extension of the atlas for (iii) to hold is automatic. 
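
As an illustration of why the compatibility condition (i) has content, here is a small worked example that is not part of the text. The two charts $\kappa_1$ and $\kappa_2$ on $\mathbb{R}$ are chosen purely for illustration; each one by itself is an atlas, but the two are not compatible with each other.

```latex
% Two one-chart atlases on M = \mathbb{R} (illustrative choice of charts).
\[
  \kappa_1(x) = x, \qquad \kappa_2(x) = x^3 .
\]
% Each is a homeomorphism of \mathbb{R} onto \mathbb{R}, so each singleton family
% satisfies (i) trivially and satisfies (ii).  The transition maps are
\[
  \kappa_2 \circ \kappa_1^{-1}(u) = u^3, \qquad
  \kappa_1 \circ \kappa_2^{-1}(u) = u^{1/3} .
\]
% The first map is smooth, but the second fails to be differentiable at u = 0,
% so \kappa_1 and \kappa_2 are not smoothly compatible.
```

Consequently the maximal families generated by $\{\kappa_1\}$ and by $\{\kappa_2\}$ are different smooth structures on the same topological space, which is why an atlas, and not merely a cover by charts, has to be specified.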

Let us make some remarks about the topology of manifolds. Let $M$ be any
manifold, let $p$ be in $M$, and let $\kappa : M_\kappa \to \widetilde{M}_\kappa$ be a chart about $p$. Then $\widetilde{M}_\kappa$
is an open neighborhood of $\kappa(p)$. Since $\mathbb{R}^n$ is locally compact, we can find a
compact subneighborhood $N$ of $\kappa(p)$ contained in $\widetilde{M}_\kappa$. Then $\kappa^{-1}(N)$ is a compact
neighborhood of $p$ in $M$, and it follows that $M$ is locally compact. Since $M$ is
by assumption Hausdorff, $M$ is topologically regular. By the Urysohn Metrization
Theorem,¹ a separable Hausdorff regular space is metrizable; therefore the
topology of a manifold is given by a metric if the manifold is separable.²

We shall not assume at any stage that M is connected, and until Section 5 we 
shall not assume that M is separable. 

A simple example of a smooth manifold is $\mathbb{R}^n$ itself, with an atlas consisting of
the single chart 1, where 1 is the identity function on $\mathbb{R}^n$. Another simple example
is any nonempty open subset $E$ of a smooth manifold $M$, which becomes a smooth
manifold by taking all the compatible charts $\kappa$ of $M$, replacing them by charts
$\kappa|_{M_\kappa \cap E}$, and eliminating redundancies. In particular, any open subset of $\mathbb{R}^n$
becomes a smooth manifold since $\mathbb{R}^n$ itself is a smooth manifold.

Two less-trivial classes of examples are spheres and real projective spaces. 
They can be realized explicitly as metric spaces, and then one can specify an atlas 
and hence a smooth structure in each case. The details of these examples are 
discussed in Problems 1-2 at the end of the chapter. 

Most manifolds, however, are constructed globally out of other manifolds or 
are pieced together from local data. The Hausdorff condition usually has to be 
checked, is often subtle, and is always important. We postpone a discussion of 
this matter for the moment. 

Let us consider functions on smooth manifolds. If $p$ is a point of the smooth
$n$-dimensional manifold $M$, a compatible chart $\kappa$ about $p$ can be viewed as giving
a local coordinate system near $p$. Specifically if the Euclidean coordinates in
$\widetilde{M}_\kappa$ are $(u_1, \dots, u_n)$, then $q = \kappa^{-1}(u_1, \dots, u_n)$ is a general point of $M_\kappa$, and we
define $n$ real-valued functions $q \mapsto x_j(q)$ on $M_\kappa$ by $x_j(q) = u_j$, $1 \le j \le n$.
Then $\kappa = (x_1, \dots, x_n)$. To refer the functions $x_j$ to Euclidean space $\mathbb{R}^n$, we use
$x_j \circ \kappa^{-1}$, which carries $(u_1, \dots, u_n)$ to $u_j$.

The way that the functions x; are referred to Euclidean space mirrors how 
a more general scalar-valued function on an open subset of M may be referred 
to Euclidean space, and then we can define the function to be smooth if it is 
smooth in the sense of Euclidean differential calculus when referred to Euclidean 
space. It will only occasionally be important whether our scalar-valued functions 
are real-valued or complex-valued. Accordingly, we shall follow the convention 
introduced in Chapter IV that F denotes the field of scalars, either R or C; either 
field is allowed (consistently throughout) unless some statement is made to the 
contrary. 

Therefore a smooth function $f : E \to \mathbb{F}$ on an open subset $E$ of $M$ is a
function with the property, for each $p \in E$ and each compatible chart $\kappa$ about $p$,
that $f \circ \kappa^{-1}$ is smooth as a function from the open subset $\kappa(M_\kappa \cap E)$ of $\mathbb{R}^n$ into
$\mathbb{F}$. A smooth function is necessarily continuous.

¹ Theorem 10.45 of Basic.
² Some equivalent conditions for separability of a smooth manifold are given in Problem 3 at the
end of the chapter.

In verifying that a scalar-valued function $f$ on an open subset $E$ of $M$ is
smooth, it is sufficient, for each point in $E$, to check a condition for only one
compatible chart about that point. The reason is the compatibility of the charts:
if $\kappa_1$ and $\kappa_2$ are two compatible charts about $p$, then $f \circ \kappa_2^{-1}$ is the composition
of the smooth function $\kappa_1 \circ \kappa_2^{-1}$ followed by $f \circ \kappa_1^{-1}$.

The space of smooth scalar-valued functions on the open set $E$ will be denoted
by $C^\infty(E)$; if we want to insist on a particular field of scalars, we write $C^\infty(E, \mathbb{R})$
or $C^\infty(E, \mathbb{C})$. The space $C^\infty(E)$ is an associative algebra under the pointwise
operations, and it contains the constants. The support of a scalar-valued function
is, as always, the closure of the set where the function is nonzero. We write
$C^\infty_{\mathrm{com}}(E)$ for the subset of $C^\infty(E)$ of functions whose support is a compact subset
of $E$. The space $C^\infty_{\mathrm{com}}(E)$, as well as the larger space $C^\infty(E)$, separates points of
$E$ as a consequence of the following lemma and proposition; the lemma makes
essential use of the fact that the manifold is Hausdorff.


Lemma 8.1. If $M$ is a smooth manifold, $\kappa$ is a compatible chart for $M$, and $f$
is a function in $C^\infty_{\mathrm{com}}(M_\kappa)$, then the function $F$ defined on $M$ to equal $f$ on $M_\kappa$
and to equal 0 off $M_\kappa$ is in $C^\infty_{\mathrm{com}}(M)$ and has support contained in $M_\kappa$.


PROOF. The set $S = \operatorname{support}(f)$ is a compact subset of $M_\kappa$ and is compact
as a subset of $M$ since the inclusion of $M_\kappa$ into $M$ is continuous. Since $M$ is
Hausdorff, $S$ is closed in $M$. The function $F$ is smooth at all points of $M_\kappa$, and in
particular at all points of $S$, and we need to prove that it is smooth at points of the
complement $U$ of $S$ in $M$. If $p$ is in $U$, we can find a compatible chart $\kappa'$ about $p$
with $M_{\kappa'} \subseteq U$. The function $F$ is 0 on $M_{\kappa'} \cap M_\kappa$ since $U \cap \operatorname{support}(f) = \varnothing$, and
it is 0 on $M_{\kappa'} \cap M_\kappa^c$ since it is 0 everywhere on $M_\kappa^c$. Therefore it is identically 0
on $M_{\kappa'}$ and is exhibited as smooth in a neighborhood of $p$. Thus $F$ is smooth.


Proposition 8.2. Suppose that $p$ is a point in a smooth manifold $M$, that $\kappa$ is
a compatible chart about $p$, and that $K$ is a compact subset of $M_\kappa$ containing $p$.
Then there is a smooth function $f : M \to \mathbb{R}$ with compact support contained in
$M_\kappa$ such that $f$ has values in $[0, 1]$ and $f$ is identically 1 on $K$.


PROOF. The set $\kappa(K)$ is a compact subset of the open subset $\widetilde{M}_\kappa = \kappa(M_\kappa)$ of
Euclidean space, and Proposition 3.5f produces a smooth function $g$ in $C^\infty_{\mathrm{com}}(\widetilde{M}_\kappa)$
with values in $[0, 1]$ that is identically 1 on $\kappa(K)$. If $f$ is defined to be $g \circ \kappa$ on
$M_\kappa$, then $f$ is in $C^\infty_{\mathrm{com}}(M_\kappa)$. Extending $f$ to be 0 on the complement of $M_\kappa$ in $M$
and applying Lemma 8.1, we see that the extended $f$ satisfies the conditions of
the proposition.

EXAMPLE. This example shows what can go wrong if the Hausdorff condition
is dropped from the definition of smooth manifold. Let $X$ be the disjoint union
of two copies of $\mathbb{R}$, say $(\mathbb{R}, +)$ and $(\mathbb{R}, -)$, with each of them open in $X$. Define
an equivalence relation on $X$ by requiring that every point be equivalent to itself
and also that $(x, +)$ be equivalent to $(x, -)$ for $x \ne 0$. The quotient space $M$ of
$X$ by this equivalence relation consists of the nonzero elements of one copy of $\mathbb{R}$,
together with two versions of 0, which we denote by $0^+$ and $0^-$. The topological
space $M$ is not Hausdorff since $0^+$ and $0^-$ cannot be separated by disjoint open
sets. Let $\mathbb{R}^+ \subseteq M$ be the image of $(\mathbb{R}, +)$ under the quotient map, and define
$\mathbb{R}^-$ similarly. Define $\kappa^+ : \mathbb{R}^+ \to \mathbb{R}^1$ and $\kappa^- : \mathbb{R}^- \to \mathbb{R}^1$ in the natural way, and
then $\kappa^+$ and $\kappa^-$ together behave like an atlas of compatible charts covering $M$.
To proceed with a theory, it is essential to be able to separate points by smooth
functions. Smooth functions are in particular continuous, and $0^+$ and $0^-$ cannot
be separated by continuous real-valued functions on $M$. Thus they cannot be
separated by smooth functions, and Proposition 8.2 must fail. It is instructive,
however, to see exactly how it fails. In the proposition let us take $p = 0^+$,
$\kappa = \kappa^+$, and $K = \{0^+\}$. We can certainly construct a smooth function $f$ on $\mathbb{R}^+$
with values in $[0, 1]$ that is 1 on $K = \{0^+\}$ and has compact support $L$ as a
subset of $\mathbb{R}^+$. However, $L$ is not closed as a subset of $M$. When $f$ is extended to
be 0 off $\mathbb{R}^+$, the extended function is not continuous, much less smooth. To be
continuous, it would have to be defined to be 1, rather than 0, at $0^-$.


Corollary 8.3. Let $p$ be a point of a smooth manifold $M$, let $U$ be an open
neighborhood of $p$, and let $f$ be in $C^\infty(U)$. Then there is a function $g$ in $C^\infty(M)$
such that $g = f$ in a neighborhood of $p$.


PROOF. Possibly by shrinking $U$, we may assume that $U$ is the domain of some
compatible chart $\kappa$ about $p$. Let $K$ be a compact neighborhood of $p$ contained in
$U$, and use Proposition 8.2 to find $h$ in $C^\infty(M)$ with compact support in $U$ such
that $h$ is identically 1 on $K$. Define $g$ to be the pointwise product $hf$ on $U$ and to
be 0 off $U$. Then $g$ equals $f$ on the neighborhood $K$ of $p$, and Lemma 8.1 shows
that $g$ is everywhere smooth.


The Euclidean chain rule yields a necessary condition for a tuple of real- 
valued functions to provide a local coordinate system near a point, and the Inverse 
Function Theorem shows the sufficiency of the condition. The details are as in 
Proposition 8.4 below. Further results of this kind appear in Problems 6-7 at the 
end of the chapter. 


Proposition 8.4. Let $M$ be an $n$-dimensional smooth manifold, let $p$ be in $M$,
let $\kappa$ be a chart about $p$, and let $f_1, \dots, f_m$ be in $C^\infty(M_\kappa, \mathbb{R})$. In order for there to
exist an open neighborhood $V$ of $p$ such that the restriction of $\kappa' = (f_1, \dots, f_m)$
to $V$ is a compatible chart, it is necessary and sufficient that

(a) $m = n$ and

(b) $\det\left[\dfrac{\partial (f_i \circ \kappa^{-1})}{\partial u_j}\right] \ne 0$ at the point $u = \kappa(p)$.

PROOF OF NECESSITY. Let $\kappa' = (f_1, \dots, f_m)$. If $\kappa'$ is a compatible chart about
$p$ when restricted to some neighborhood $V$ of $p$, then $\kappa' \circ \kappa^{-1}$ and $\kappa \circ \kappa'^{-1}$ are
smooth mappings on open sets in Euclidean space that are inverse to each other.
By the chain rule the products of their Jacobian matrices in the two orders are the
identity matrices of the appropriate size. Therefore $m = n$, and the determinant
of the Jacobian matrix of $\kappa' \circ \kappa^{-1}$ at $\kappa(p)$ is not 0.


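PROOF OF SUFFICIENCY. Let $m = n$. If (b) holds, then the Inverse Function
Theorem produces an open neighborhood $V'$ of $\kappa'(p)$ and an open neighborhood
$U' \subseteq \widetilde{M}_\kappa$ of $\kappa(p)$ such that $\kappa' \circ \kappa^{-1}$ has a smooth inverse $g$ mapping $V'$ one-one
onto $U'$. Let $V = \kappa^{-1}(U')$, and define $h = \kappa^{-1} \circ g$. Then $h$ maps $V'$ one-one onto
$V$ and satisfies
\[
h \circ \kappa' = h \circ (\kappa' \circ \kappa^{-1}) \circ \kappa = \kappa^{-1} \circ (g \circ (\kappa' \circ \kappa^{-1})) \circ \kappa = \kappa^{-1} \circ \kappa = 1.
\]
Thus $h = \kappa'^{-1}$ and $\kappa'|_V$ is a chart. To see that the chart $\kappa'|_V$ is compatible, let
$\kappa''$ be a chart in the given atlas such that $V \cap M_{\kappa''} \ne \varnothing$. Then $\kappa' \circ \kappa''^{-1} =
(\kappa' \circ \kappa^{-1}) \circ (\kappa \circ \kappa''^{-1})$ is smooth, and so is $\kappa'' \circ \kappa'^{-1} = \kappa'' \circ h = (\kappa'' \circ \kappa^{-1}) \circ g$.
Hence the chart $\kappa'|_V$ is compatible.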
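
To see the criterion of Proposition 8.4 at work, here is a short worked example that is not taken from the text. It assumes $M = \mathbb{R}^2$ with the identity chart $\kappa = (x_1, x_2)$, and the functions $f_1, f_2$ are chosen only for illustration.

```latex
% Illustrative functions on M = \mathbb{R}^2 with the identity chart.
\[
  f_1 = x_1^2 - x_2^2, \qquad f_2 = 2 x_1 x_2 .
\]
% Jacobian matrix of (f_1 \circ \kappa^{-1}, f_2 \circ \kappa^{-1}) in the
% Euclidean coordinates (u_1, u_2):
\[
  \left[ \frac{\partial (f_i \circ \kappa^{-1})}{\partial u_j} \right]
  = \begin{pmatrix} 2u_1 & -2u_2 \\ 2u_2 & 2u_1 \end{pmatrix},
  \qquad
  \det = 4\,(u_1^2 + u_2^2).
\]
% The determinant is nonzero exactly when (u_1, u_2) \neq (0, 0).
```

So condition (b) holds at every $p \ne 0$, and $(f_1, f_2)$ restricts to a compatible chart on some neighborhood of each such $p$. At $p = 0$ the determinant vanishes, and no neighborhood works. The conclusion is only local: $(f_1, f_2)$ takes the same value at $x$ and $-x$, so it is not a chart on all of $\mathbb{R}^2 - \{0\}$.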


A smooth function $F : E \to N$ from an open subset $E$ of the $n$-dimensional
smooth manifold $M$ into a smooth $k$-dimensional manifold $N$ is a continuous
function with the property that for each $p \in E$, each compatible $M$ chart $\kappa$ about
$p$, and each compatible $N$ chart $\kappa'$ about $F(p)$, the function $\kappa' \circ F \circ \kappa^{-1}$ is smooth
from an open neighborhood of $\kappa(p)$ in $\kappa(M_\kappa \cap E) \subseteq \mathbb{R}^n$ into $\mathbb{R}^k$. The function
$\kappa' \circ F \circ \kappa^{-1}$ is what $F$ becomes when it is referred to Euclidean space. Let us
examine $\kappa' \circ F \circ \kappa^{-1}$ further.

In a compatible $M$ chart $\kappa$ about $p$, we have used $(u_1, \dots, u_n)$ as Euclidean
coordinates within $\widetilde{M}_\kappa$, and the local coordinate functions on $M_\kappa$ are the members
$x_j$ of $C^\infty(M_\kappa, \mathbb{R})$ such that $x_j \circ \kappa^{-1}(u_1, \dots, u_n) = u_j$. In a compatible $N$ chart
$\kappa'$ about $F(p)$, let us use $(v_1, \dots, v_k)$ as Euclidean coordinates within $\widetilde{N}_{\kappa'}$, and
let us denote the local coordinate functions on $N_{\kappa'}$ by $y_i$. The formula for $y_i$ is
$y_i \circ \kappa'^{-1}(v_1, \dots, v_k) = v_i$. The function $\kappa' \circ F \circ \kappa^{-1}$ takes values of the form
$(v_1, \dots, v_k)$, and the way to extract the $i^{\text{th}}$ coordinate function of $\kappa' \circ F \circ \kappa^{-1}$
is to follow it with $y_i \circ \kappa'^{-1}$. Thus when $F$ is referred to Euclidean space, the
$i^{\text{th}}$ coordinate function of the result is $y_i \circ F \circ \kappa^{-1}$. We shall write $F_i$ for this
coordinate function.

If $F : M \to N$ is a smooth function between smooth manifolds and if $F$ has
a smooth inverse, then $F$ is called a diffeomorphism.

If $M$ and $N$ are smooth manifolds, then the product $M \times N$ becomes a smooth
manifold in a natural way by taking an atlas of $M \times N$ to consist of all products
$\kappa \times \kappa'$ of compatible charts of $M$ by compatible charts of $N$. With this definition
of smooth structure for $M \times N$, the projections $M \times N \to M$ and $M \times N \to N$
are smooth and so are the inclusions $M \to M \times \{y\}$ and $N \to \{x\} \times N$ for any
$y$ in $N$ and $x$ in $M$.

Fix a point p in M. The “tangent space” to M at p will be defined shortly in a 
way so as to consist of all first-derivative operators on functions at p. Traditionally 
one uses only real-valued functions in making the definition, but we shall adhere to 
our convention and allow scalars from either R or C except when we need to make 
a choice. Construction of the tangent space can be done in a concrete fashion, 
using the coordinate functions x;, or it can be done with a more abstract definition. 
The latter approach, which we follow, has the advantage of incorporating all the 
necessary analysis into the problem of sorting out the definition rather than 
into a version of the chain rule valid for manifolds. In other 
words the one result that will need proof will be a statement limiting the size of 
the tangent space, and the chain rule will become purely a formality. 

To the extent that a tangent vector at p is a first derivative operator at p, 
its effect will depend only on the behavior of functions in a neighborhood of p. 
Within the abstract approach, there are then two subapproaches. One subapproach 
works with functions on a fixed but arbitrary open set containing p and looks at 
a kind of first-derivative-at-p operation on them. The other subapproach works 
simultaneously with all functions such that any two of them coincide on some 
neighborhood of p. Either subapproach will work in our present context of 
smooth manifolds. It turns out, however, that a similar formalism applies to 
other kinds of manifolds, particularly to complex manifolds and to real-analytic 
manifolds, and only the second subapproach works for them. We shall therefore 
introduce the idea of the tangent space to M at p by working simultaneously with 
all functions such that any two of them coincide on some neighborhood of p. The 
operative notion is that of a “germ” at p. 

To emphasize domains, let us temporarily write $(f, U)$ for a member of
$C^\infty(U)$. We consider all such objects such that $p$ lies in $U$, and we define $(f, U)$ to
be equivalent to $(g, V)$ if $f = g$ on some subneighborhood about $p$ of the common
domain $U \cap V$. This notion of "equivalent" is readily checked to be an equivalence
relation, and we let $C_p(M)$ be the set of equivalence classes. An equivalence class
is called a germ of a smooth scalar-valued function at $p$. The set of germs inherits
addition and multiplication from that for functions. Specifically the germ of the
sum $(f, U) + (g, V)$ is defined to be the germ of $\big((f|_{U \cap V}) + (g|_{U \cap V}),\ U \cap V\big)$. One
has to check that this definition is independent of the choice of representatives,
but that is routine. Multiplication is handled similarly. Then one checks that the
operations on germs have the usual properties of an associative algebra over $\mathbb{F}$.
Let us sketch the argument for associativity of addition. Let three germs be given,
and let $(f, U)$, $(g, V)$, and $(h, W)$ be representatives. A representative of the sum
of the three is defined on the intersection $I = U \cap V \cap W$. On $I$, the restrictions
to $I$ satisfy $(f + g) + h = f + (g + h)$ because of associativity for addition of
functions; hence the germs of the two sides of the associativity formula are equal,
and addition is associative in $C_p(M)$.

The algebra $C_p(M)$ admits a distinguished linear function into the field of
scalars $\mathbb{F}$, namely evaluation at $p$. If a germ is given and $(f, U)$ is a representative,
then the value $f(p)$ at $p$ is certainly independent of the choice of representative;
thus evaluation at $p$ is well defined on $C_p(M)$. We denote it by $e$. Although germs
are not functions, we often use the same symbol for a germ as for a representative
function in order to remind ourselves how germs behave. A derivation of $C_p(M)$
is a linear function $L : C_p(M) \to \mathbb{F}$ such that $L(fg) = L(f)e(g) + e(f)L(g)$. If
the germ $f$ is the class of a function $(f, U)$, then we can define $L$ on the function
to be equal to $L$ on the germ, and the formula for $L$ on a product of two functions
will be valid on the common domain of the two representative functions.

Any derivation $L$ of $C_p(M)$ has to satisfy $L(1) = L(1 \cdot 1) = L(1)1 + 1L(1) =
2L(1)$ and thus must annihilate the constant functions and their germs. The
derivations of $C_p(M)$ are also called tangent vectors to $M$ at $p$, and the space of
these derivations is called the tangent space to $M$ at $p$ and is denoted by $T_p(M)$.

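For $M = \mathbb{R}^n$, evaluation of a first partial derivative at $p$ is an example. More
generally we can obtain examples for any $M$ as follows: Let $\kappa$ be a compatible
chart with $p$ in $M_\kappa$. The specific derivations of $C_p(M)$ that we construct will
depend on the choice of $\kappa$. We obtain $n$ examples $\left[\frac{\partial}{\partial x_j}\right]_p$ of derivations of $C_p(M)$,
one for each $j$ with $1 \le j \le n$, by the definition

\[
\left[\frac{\partial f}{\partial x_j}\right]_p
= \frac{\partial (f \circ \kappa^{-1})}{\partial u_j}(\kappa(p))
= \left.\frac{\partial (f \circ \kappa^{-1})}{\partial u_j}\right|_{(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))}.
\]

For $f = x_i$, we have

\[
\left[\frac{\partial x_i}{\partial x_j}\right]_p
= \left.\frac{\partial (x_i \circ \kappa^{-1})}{\partial u_j}\right|_{(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))}
= \left.\frac{\partial u_i}{\partial u_j}\right|_{(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))}
= \delta_{ij}.
\]

Consequently the $n$ derivations $\left[\frac{\partial}{\partial x_j}\right]_p$ of $C_p(M)$ are linearly independent.
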
Proposition 8.5. Let $M$ be a smooth manifold of dimension $n$, let $p$ be in $M$,
and let $\kappa$ be a compatible chart about $p$. Then the $n$ derivations $\left[\frac{\partial}{\partial x_j}\right]_p$ of $C_p(M)$
form a basis for the tangent space $T_p(M)$ of $M$ at $p$, and any derivation $L$ of
$C_p(M)$ satisfies

\[
L = \sum_{j=1}^{n} L(x_j) \left[\frac{\partial}{\partial x_j}\right]_p.
\]

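PROOF. We know that the $n$ explicit derivations are linearly independent. To
prove spanning, let $L$ be a derivation of $C_p(M)$, and let $(f, E)$ represent a member
of $C_p(M)$. Without loss of generality, we may assume that $E \subseteq M_\kappa$ and that $\kappa(E)$
is an open ball in $\mathbb{R}^n$. Put $u_0 = (u_{0,1}, \dots, u_{0,n}) = \kappa(p)$, let $q$ be a variable point
in $E$, and define $u = (u_1, \dots, u_n) = \kappa(q)$. Taylor's Theorem³ applied to $f \circ \kappa^{-1}$
on $\kappa(E)$ gives

\[
f \circ \kappa^{-1}(u) = f \circ \kappa^{-1}(u_0)
+ \sum_{j=1}^{n} (u_j - u_{0,j}) \frac{\partial (f \circ \kappa^{-1})}{\partial u_j}(u_0)
+ \sum_{i,j} (u_i - u_{0,i})(u_j - u_{0,j}) R_{ij}(u)
\]

with $R_{ij}$ in $C^\infty(\kappa(E))$. Referring this formula to $M$, we obtain

\[
f(q) = f(p) + \sum_{j=1}^{n} (x_j(q) - x_j(p)) \left[\frac{\partial f}{\partial x_j}\right]_p
+ \sum_{i,j} (x_i(q) - x_i(p))(x_j(q) - x_j(p))\, r_{ij}(q)
\]

on $E$, where $r_{ij} = R_{ij} \circ \kappa$ on $E$. Because $L$ annihilates constants and has the
derivation property, application of $L$ yields

\[
\begin{aligned}
L(f) &= \sum_{j=1}^{n} L(x_j) \left[\frac{\partial f}{\partial x_j}\right]_p
+ \sum_{i,j} \Big( L(x_i)\,(e(x_j) - x_j(p))\, e(r_{ij}) \\
&\qquad + (e(x_i) - x_i(p))\, L(x_j)\, e(r_{ij})
+ (e(x_i) - x_i(p))(e(x_j) - x_j(p))\, L(r_{ij}) \Big) \\
&= \sum_{j=1}^{n} L(x_j) \left[\frac{\partial f}{\partial x_j}\right]_p,
\end{aligned}
\]

the double sum vanishing because $e(x_i) = x_i(p)$ for each $i$. This is the formula
as asserted.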
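
The formula of Proposition 8.5 can be checked by hand in a simple case. The following sketch is not from the text: it assumes $M = \mathbb{R}^2$ with the identity chart, a point $p = (a, b)$, and a derivation $L$ defined, for illustration, by differentiating along a circular arc through $p$.

```latex
% Illustrative derivation at p = (a, b) in M = \mathbb{R}^2 (identity chart).
\[
  L(f) = \left.\frac{d}{dt}\, f(a\cos t - b\sin t,\; a\sin t + b\cos t)\right|_{t=0}.
\]
% L is linear and satisfies the product rule L(fg) = L(f) g(p) + f(p) L(g),
% so it is a derivation of C_p(M).  Its values on the coordinate functions are
\[
  L(x_1) = \left.\frac{d}{dt}(a\cos t - b\sin t)\right|_{t=0} = -b,
  \qquad
  L(x_2) = \left.\frac{d}{dt}(a\sin t + b\cos t)\right|_{t=0} = a.
\]
% Proposition 8.5 then predicts
\[
  L = -b \left[\frac{\partial}{\partial x_1}\right]_p
      + a \left[\frac{\partial}{\partial x_2}\right]_p ,
\]
% which agrees with the Euclidean chain rule applied directly to L(f).
```

The point of the proposition is that the two numbers $L(x_1)$ and $L(x_2)$ already determine $L$ on every germ.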


A smooth function $F : E \to N$ as above has a "differential" that carries the
tangent space to $M$ at $p$ linearly to the tangent space to $N$ at $F(p)$. We shall
define the differential, find its matrix relative to local coordinates, and establish
a version of the chain rule for smooth manifolds. Let $L$ be in $T_p(M)$, and
let $g$ be in $C_{F(p)}(N)$. Regard $g$ as a smooth function defined on some open
neighborhood of $F(p)$, and define $(dF)_p(L)$ to be the member of $T_{F(p)}(N)$ given
by $(dF)_p(L)(g) = L(g \circ F)$. To see that $(dF)_p(L)$ is indeed in $T_{F(p)}(N)$, we
need to check that $L(g \circ F)$ depends only on the germ of $g$ and not on the choice
of representative function; also we need to check the derivation property.


³ In the form of Theorem 3.11 of Basic. 

To check these things, let $g$ and $g^*$ be functions representing the same germ at
$F(p)$. Then $g = g^*$ in a neighborhood of $F(p)$, and the continuity of $F$ ensures
that $g \circ F = g^* \circ F$ in a neighborhood of $p$. The derivation $L$ depends only
on a germ at $p$, and thus $(dF)_p(L)(g)$ depends only on the germ of $g$. For the
derivation property we have

\[
\begin{aligned}
(dF)_p(L)(g_1 g_2) &= L((g_1 g_2) \circ F) = L((g_1 \circ F)(g_2 \circ F)) \\
&= L(g_1 \circ F)\,(g_2(F(p))) + (g_1(F(p)))\, L(g_2 \circ F) \\
&= (dF)_p(L)(g_1)\,(g_2(F(p))) + (g_1(F(p)))\,(dF)_p(L)(g_2),
\end{aligned}
\]

and thus $(dF)_p(L)$ is in $T_{F(p)}(N)$.

The mapping $(dF)_p : T_p(M) \to T_{F(p)}(N)$ is evidently linear, and it is called
the differential of $F$ at $p$. We may write $dF_p$ for it if there is no ambiguity; later
we shall denote it by $dF(p)$ as well. Proposition 8.5 gives us bases of $T_p(M)$
and $T_{F(p)}(N)$, and we shall determine the matrix of $dF_p$ relative to these bases.


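Proposition 8.6. Let $M$ and $N$ be smooth manifolds of respective dimensions
$n$ and $k$, and let $F : M \to N$ be a smooth function. Fix $p$ in $M$, let $\kappa$ be an
$M$ chart about $p$, and let $\kappa'$ be an $N$ chart about $F(p)$. Relative to the bases
$\left[\frac{\partial}{\partial x_j}\right]_p$ of $T_p(M)$ and $\left[\frac{\partial}{\partial y_i}\right]_{F(p)}$ of $T_{F(p)}(N)$, the matrix of the linear function
$dF_p : T_p(M) \to T_{F(p)}(N)$ is

\[
\left[\ \left.\frac{\partial F_i}{\partial u_j}\right|_{(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))}\ \right].
\]

REMARK. In other words it is the Jacobian matrix of the set of coordinate
functions of the function obtained by referring $F$ to Euclidean space. Hence the
differential is the object for smooth manifolds that generalizes the multivariable
derivative for Euclidean space. Accordingly, let us make the definition

\[
\left.\frac{\partial F_i}{\partial x_j}\right|_p
= \left.\frac{\partial F_i}{\partial u_j}\right|_{(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))}.
\]
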
Ox; p Ou; 
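PROOF. Application of the definitions gives

\[
dF_p\!\left(\left[\frac{\partial}{\partial x_j}\right]_p\right)(y_i)
= \left[\frac{\partial}{\partial x_j}\right]_p (y_i \circ F)
= \left.\frac{\partial (y_i \circ F \circ \kappa^{-1})}{\partial u_j}\right|_{(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))}
= \left.\frac{\partial F_i}{\partial u_j}\right|_{(x_1(p), \dots, x_n(p))}.
\]

The formula in Proposition 8.5 allows us to express any member of $T_{F(p)}(N)$ in
terms of its values on the local coordinate functions $y_i$, and therefore

\[
dF_p\!\left(\left[\frac{\partial}{\partial x_j}\right]_p\right)
= \sum_{i=1}^{k} dF_p\!\left(\left[\frac{\partial}{\partial x_j}\right]_p\right)(y_i) \left[\frac{\partial}{\partial y_i}\right]_{F(p)}
= \sum_{i=1}^{k} \left.\frac{\partial F_i}{\partial u_j}\right|_{(x_1(p), \dots, x_n(p))} \left[\frac{\partial}{\partial y_i}\right]_{F(p)}.
\]

Thus the matrix is as asserted.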
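
As a concrete instance of Proposition 8.6, the following worked computation (not part of the text) assumes $M = N = \mathbb{R}^2$ with identity charts and takes $F$ to be the polar-coordinate map, chosen here only for illustration.

```latex
% Illustrative smooth map F : \mathbb{R}^2 \to \mathbb{R}^2, identity charts on both sides.
\[
  F(x_1, x_2) = (x_1 \cos x_2,\; x_1 \sin x_2),
  \qquad
  F_1(u) = u_1 \cos u_2, \quad F_2(u) = u_1 \sin u_2 .
\]
% Matrix of dF_p relative to the coordinate bases, evaluated at u = p:
\[
  \left[ \frac{\partial F_i}{\partial u_j} \right]
  = \begin{pmatrix} \cos u_2 & -u_1 \sin u_2 \\ \sin u_2 & u_1 \cos u_2 \end{pmatrix},
  \qquad \det = u_1 .
\]
% Reading off the second column gives the image of the second basis vector:
\[
  dF_p\!\left(\left[\frac{\partial}{\partial x_2}\right]_p\right)
  = -u_1 \sin u_2 \left[\frac{\partial}{\partial y_1}\right]_{F(p)}
    + u_1 \cos u_2 \left[\frac{\partial}{\partial y_2}\right]_{F(p)} .
\]
```

The column index $j$ corresponds to the basis vector in $T_p(M)$ and the row index $i$ to the basis vector in $T_{F(p)}(N)$. In this example $dF_p$ is invertible exactly at the points $p$ with first coordinate nonzero.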


Proposition 8.7 (chain rule). Let $M$, $N$, and $R$ be smooth manifolds, and let
$F : M \to N$ and $G : N \to R$ be smooth functions. If $p$ is in $M$, then

\[
d(G \circ F)_p = dG_{F(p)} \circ dF_p.
\]

PROOF. If $L$ is in $T_p(M)$ and $h$ is in $C_{G(F(p))}(R)$, then the definitions give

\[
d(G \circ F)_p(L)(h) = L(h \circ G \circ F) = dF_p(L)(h \circ G) = dG_{F(p)}(dF_p(L))(h),
\]

as asserted.


2. Vector Fields and Integral Curves 


A vector field on an open subset $U$ of $\mathbb{R}^n$ was defined in Chapter IV of Basic
as a function $X : U \to \mathbb{R}^n$. The vector field is smooth if $X$ is a smooth
function. In classical notation, $X$ is written $X = \sum_{j=1}^{n} a_j(x_1, \dots, x_n) \frac{\partial}{\partial x_j}$, and
the function carries $(x_1, \dots, x_n)$ to $(a_1(x_1, \dots, x_n), \dots, a_n(x_1, \dots, x_n))$. The
traditional geometric interpretation of $X$ is to attach to each point $p$ of $U$ the vector
$X(p)$ as an arrow based at $p$. This interpretation is appropriate, for example, if $X$
represents the velocity vector at each point in space of a time-independent fluid
flow.

Taking the interpretation with arrows into account and realizing that the use
of arrows implicitly takes $\mathbb{F} = \mathbb{R}$, we see that an appropriate generalization in
the case of a smooth manifold $M$ is this: a vector field attaches to each $p$ in $M$ a
member of the tangent space $T_p(M)$. Let us make this definition more precise.

If $M$ is a smooth $n$-dimensional manifold, let

\[
T(M) = \{(p, L) \mid p \in M \text{ and } L \in T_p(M)\},
\]

and let $\pi : T(M) \to M$ be the projection to the first coordinate. A vector field
$X$ on an open subset $U$ of $M$ is a function from $U$ to $T(M)$ such that $\pi \circ X$ is the
identity on $U$; so $X$ is indeed a function whose value at any point $p$ is a tangent
vector at $p$. The value of $X$ at $p$ will be written $X_p$.

We shall be mostly interested in vector fields that are “smooth.” Ultimately 
this smoothness will be defined by making T (M) into a smooth manifold known 
as the tangent bundle of M. The local structure of this smooth manifold is easily 
accessible via Proposition 8.5. That proposition shows that having a chart $\kappa$ of $M$
singles out an ordered basis of the tangent space at each point in $M_\kappa$. Identifying
all these tangent spaces with $\mathbb{F}^n$ by means of this ordered basis, we obtain an
identification of $\{(p, L) \mid p \in M_\kappa \text{ and } L \in T_p(M)\}$ with $M_\kappa \times \mathbb{F}^n$ and hence
with $\widetilde{M}_\kappa \times \mathbb{F}^n$. The result is a chart for $T(M)$ that we shall include in our atlas. It
will be fairly easy to see how these charts are to be patched together compatibly.
The problem in obtaining the structure of a smooth manifold is in proving that
$T(M)$ is Hausdorff. Although the Hausdorff property may look evident at first
glance, it perhaps looks equally evident for the example with $\mathbb{R}^+$ and $\mathbb{R}^-$ in
the previous section, and there the Hausdorff property fails. Thus some care is
appropriate. We shall study this matter more carefully in Section 3 and complete 
the construction of the smooth structure on the tangent bundle in Section 4. 

For now we shall proceed with a more utilitarian definition of smoothness of
a vector field. A vector field $X$ on $M$ carries $C^\infty(U)$, for any open subset $U$ of
$M$, to a space of functions on $U$ by the rule $(Xf)(p) = X_p(f)$. We say that the
vector field $X$ on $M$ is smooth if $Xf$ is in $C^\infty(U)$ whenever $U$ is open in $M$ and
$f$ is in $C^\infty(U)$.


Proposition 8.8. Let $X$ be a vector field on a smooth $n$-dimensional manifold
$M$. If $\kappa = (x_1, \dots, x_n)$ is a compatible chart and if $f$ is in $C^\infty(M_\kappa)$, then

\[
Xf(p) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(p)\, (Xx_i)(p) \qquad \text{for } p \in M_\kappa.
\]

The vector field $X$ is smooth if and only if $Xx_i$ is smooth for each coordinate
function $x_i$ of each compatible chart on $M$.


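PROOF. The displayed formula is immediate from Proposition 8.5. To see that
if $X$ is smooth, then $Xx_i$ is smooth on $M_\kappa$, let $q$ be a point of $M_\kappa$, and choose, by
Proposition 8.2, a function $g$ in $C^\infty(M)$ such that $g = x_i$ in a neighborhood of
$q$. Then $\frac{\partial g}{\partial x_j}(p) = \delta_{ij}$ identically for $p$ in that neighborhood of $q$. The displayed
formula shows that $Xg(p) = Xx_i(p)$ for $p$ in that neighborhood. Since $Xg$ is
smooth everywhere, $Xx_i$ must be smooth in that neighborhood of $q$.

Conversely suppose that each $Xx_i$ is smooth. Let $f$ be in $C^\infty(M)$. Since $\frac{\partial f}{\partial x_i}(p)$
means $\left.\frac{\partial (f \circ \kappa^{-1})}{\partial u_i}\right|_{u = \kappa(p)}$ and since $f \circ \kappa^{-1}$ is in $C^\infty(\widetilde{M}_\kappa)$, the function $p \mapsto \frac{\partial f}{\partial x_i}(p)$
is in $C^\infty(M_\kappa)$. Since each $Xx_i$ is in $C^\infty(M_\kappa)$ by assumption, $Xf|_{M_\kappa}$ is in $C^\infty(M_\kappa)$.
Then $Xf$ must be in $C^\infty(M)$ because the compatible chart $\kappa$ is arbitrary.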

A smooth curve $c(t)$ on the smooth manifold $M$ is a smooth function $c$ from
an open interval of $\mathbb{R}^1$ into $M$. The smooth curve $c(t)$ is an integral curve for a
smooth real-valued vector field $X$ if $X_{c(t)} = dc_t\!\left(\frac{d}{dt}\right)$ for all $t$ in the domain of $c$.
Integral curves in open subsets of Euclidean space were discussed in Section IV.2
of Basic. We shall now transform those results into results about integral curves
on smooth manifolds.

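Let $M$ be a smooth manifold of dimension $n$, let $\kappa = (x_1, \dots, x_n)$ be a compatible
chart, and let $X = \sum_{j=1}^{n} a_j(x) \frac{\partial}{\partial x_j}$ be the local expression from Proposition
8.8 for a smooth real-valued vector field $X$ on $M$ within $M_\kappa$, so that $a_j$ is in
$C^\infty(M_\kappa, \mathbb{R})$. Let $c(t)$ be a smooth curve on $M_\kappa$. Define $b_j(y) = a_j(\kappa^{-1}(y))$ for
$y \in \widetilde{M}_\kappa \subseteq \mathbb{R}^n$, and let $y(t) = (y_1(t), \dots, y_n(t)) = \kappa(c(t))$, so that $y(t)$ is a
smooth curve on $\widetilde{M}_\kappa$. Then we have

\[
X_{c(t)} f = \sum_{j=1}^{n} a_j(c(t)) \left[\frac{\partial f}{\partial x_j}\right]_{c(t)}
= \sum_{j=1}^{n} a_j(c(t)) \left.\frac{\partial (f \circ \kappa^{-1})}{\partial u_j}\right|_{u = \kappa(c(t))}
= \sum_{j=1}^{n} b_j(y(t)) \left.\frac{\partial (f \circ \kappa^{-1})}{\partial u_j}\right|_{u = y(t)}
\]

and

\[
dc_t\!\left(\frac{d}{dt}\right)(f) = \frac{d}{dt}(f \circ c)(t)
= \frac{d}{dt}\big((f \circ \kappa^{-1}) \circ (\kappa \circ c)\big)(t)
= \sum_{j=1}^{n} \left.\frac{\partial (f \circ \kappa^{-1})}{\partial u_j}\right|_{u = y(t)} \frac{dy_j}{dt}(t).
\]

The two left sides are equal for all $f$, i.e., $c(t)$ is an integral curve for $X$
in $M_\kappa$, if and only if the two right sides are equal for all $f$, i.e., $y(t)$ satisfies

\[
\frac{dy_j}{dt} = b_j(y) \qquad \text{for } 1 \le j \le n.
\]

The latter condition is the condition for $y(t)$ to be an integral curve for the vector
field $\sum_{j=1}^{n} b_j(y) \frac{\partial}{\partial y_j}$ on $\widetilde{M}_\kappa$ in $\mathbb{R}^n$. Applying Proposition 4.4 of Basic, which in turn
is an immediate consequence of the standard existence-uniqueness results for
systems of ordinary differential equations, we obtain the following generalization
to manifolds.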


Proposition 8.9. Let X be a smooth real-valued vector field on a smooth 
manifold $M$, and let $p$ be in $M$. Then there exist an $\varepsilon > 0$ and an integral curve
$c(t)$ defined for $-\varepsilon < t < \varepsilon$ such that $c(0) = p$. Any two integral curves $c$ and
$d$ for $X$ having $c(0) = d(0) = p$ coincide on the intersection of their domains.
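
The reduction above says that, in a chart, an integral curve is a solution of the system $dy_j/dt = b_j(y(t))$. The following minimal sketch solves that system numerically for one assumed example: the rotation field $b(y) = (-y_2, y_1)$ on $\mathbb{R}^2$, whose integral curve through $(1, 0)$ is $(\cos t, \sin t)$. The field, the Runge-Kutta scheme, and the step count are illustrative choices, not part of the text.

```python
import math

def rk4_step(field, y, h):
    """One classical Runge-Kutta step for the system dy/dt = field(y)."""
    k1 = field(y)
    k2 = field([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = field([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = field([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + (h / 6.0) * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def integral_curve(field, y0, t_end, steps=1000):
    """Approximate y(t_end) for the integral curve of `field` with y(0) = y0."""
    y = list(y0)
    h = t_end / steps
    for _ in range(steps):
        y = rk4_step(field, y, h)
    return y

def rotation_field(y):
    # Assumed example: b(y) = (-y_2, y_1), i.e. X = -y_2 d/dy_1 + y_1 d/dy_2.
    return [-y[1], y[0]]

end = integral_curve(rotation_field, [1.0, 0.0], 1.0)
# Distance between the numerical endpoint and the exact point (cos 1, sin 1).
error = math.hypot(end[0] - math.cos(1.0), end[1] - math.sin(1.0))
```

Uniqueness in Proposition 8.9 is what makes the comparison with the closed-form curve meaningful: any curve satisfying the same system with the same initial point must agree with it.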
As in the Euclidean case, the interest is not only in Proposition 8.9 in isolation
but also in what happens to the integral curves when X is part of a family of vector 
fields. 


Proposition 8.10. Let $X^{(1)},\dots,X^{(m)}$ be smooth real-valued vector fields on
a smooth $n$-dimensional manifold $M$, and let $p$ be in $M$. Let $V$ be a bounded
open neighborhood of $0$ in $\mathbb{R}^m$. For $\lambda$ in $V$, put $X_\lambda = \sum_{j=1}^m \lambda_j X^{(j)}$. Then there
exist an $\varepsilon > 0$ and a system of integral curves $c(t,\lambda)$, defined for $t \in (-\varepsilon,\varepsilon)$
and $\lambda \in V$, such that $c(\,\cdot\,,\lambda)$ is an integral curve for $X_\lambda$ with $c(0,\lambda) = p$. Each
curve $c(t,\lambda)$ is unique, and the function $c : (-\varepsilon,\varepsilon)\times V \to M$ is smooth. If
$m = n$, if the vectors $X^{(1)}(p),\dots,X^{(n)}(p)$ are linearly independent, and if $\delta$ is
any positive number less than $\varepsilon$, then $c(\delta,\,\cdot\,)$ is a diffeomorphism from an open
subneighborhood of $0$ (depending on $\delta$) onto an open subset of $M$, and its inverse
defines a chart about $p$.


PROOF. All but the last sentence is just a translation of Proposition 4.5 of 
Basic into the setting with manifolds. For the last sentence, Proposition 4.5 of
Basic establishes that the Jacobian matrix at $\lambda = 0$ of the function $\lambda \mapsto c(\delta,\lambda)$
transferred to Euclidean space is nonsingular, and the rest follows from
Proposition 8.4. 
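
To see what the last sentence of Proposition 8.10 says in the simplest possible case, consider the following sketch. The manifold $M = \mathbb{R}^2$ and the two coordinate vector fields are assumptions chosen for illustration only.

```latex
% Assumed example: M = R^2, X^{(1)} = \partial/\partial x_1, X^{(2)} = \partial/\partial x_2.
X_\lambda = \lambda_1\,\frac{\partial}{\partial x_1} + \lambda_2\,\frac{\partial}{\partial x_2},
\qquad c(t,\lambda) = p + t\lambda, \qquad c(0,\lambda) = p .
% For fixed \delta > 0 the map \lambda \mapsto c(\delta,\lambda) = p + \delta\lambda has Jacobian
\Big[\frac{\partial c_i(\delta,\lambda)}{\partial \lambda_j}\Big] = \delta\, I ,
% which is nonsingular, and the inverse q \mapsto (q - p)/\delta is a chart about p.
```

In this example the curves are straight lines through $p$, and the chart produced by the proposition is just a rescaled translation of the standard coordinates.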


3. Identification Spaces 


We saw in a 1-dimensional example in Section 1 that the Hausdorff condition 
is subtle (and does not always hold) when one tries to build a smooth manifold 
out of smooth charts. In Section 2 we saw that it would be desirable to obtain a 
smooth manifold structure on the tangent bundle of a smooth manifold in order to 
make the definition of smoothness of vector fields more evident from the smooth 
structure, and the natural way of proceeding was to piece the structure together 
from charts that were products of charts for the smooth manifold by mappings on 
whole Euclidean spaces. The example in Section 1 serves as a reminder, however, 
that we should not take the Hausdorff condition for granted in working with the 
tangent bundle. 

In fact, the construction in both instances appears in a number of important 
situations in mathematics. One is in constructing “vector bundles” and more 
general “fiber bundles” out of local data, and another is in constructing covering 
spaces in the theory of fundamental groups. Still a third is in the construction of 
restricted direct products⁴ in Problem 30 in Chapter IV.


⁴In fairness it should be said that restricted direct products, which involve a direct limit, are more
easily handled by the method in Chapter IV than by a construction analogous to that of the tangent 
bundle. 
For a clearer picture of what is happening, let us abstract the situation. The
idea is to build complicated topological spaces out of simpler ones by piecing
together local data. For lack of a better name for the abstract construction, we
shall call the result an “identification space.” A simple example of the use of
charts in defining manifold structures will point the way to the general definition. 


EXAMPLE. Suppose, by way of being concrete, that we have overlapping open
sets $U_1$ and $U_2$ in $\mathbb{R}^n$. We take $U_1$ and $U_2$ as completely understood, and we want
to describe $U_1 \cup U_2$ as a topological space. Let $X$ be the disjoint union of $U_1$
and $U_2$, which we write as $X = U_1 \sqcup U_2$. By definition, $X$ as a set is the set
of all pairs $(x, i)$ with $x$ in $U_i$, and $i$ takes on the values 1 and 2. We identify
$U_1 \subseteq U_1 \sqcup U_2$ with the set of pairs $(x, 1)$ and $U_2 \subseteq U_1 \sqcup U_2$ with the set of
pairs $(y, 2)$. A subset $E$ of $X$ is defined to be open if $E \cap U_1$ is open in $U_1$ and
$E \cap U_2$ is open in $U_2$. The resulting collection of open sets is a topology for $X$,
and the embedded copies of $U_1$ and $U_2$ in $X$ are open. We define $(x, 1) \sim (y, 2)$
if $x = y$ as members of $\mathbb{R}^n$, and the identification space is $X/\!\sim$. We give $X/\!\sim$
the quotient topology, and it is not hard to see that $X/\!\sim$ is homeomorphic to the
union $U_1 \cup U_2$ as a topological subspace of the metric space $\mathbb{R}^n$.


Let us come to the general definition. We are given a set of topological spaces
$W_i$ for $i$ in some nonempty index set $I$, and we assume, for each ordered pair
$(i, j)$, that we have a homeomorphism $\psi_{ji}$ of an open subset $W_{ji}$ of $W_i$ onto an
open subset $W_{ij}$ of $W_j$ (possibly with $W_{ji}$ and $W_{ij}$ both empty) such that

(i) $\psi_{ii}$ is the identity on $W_{ii} = W_i$,
(ii) $\psi_{ij} \circ \psi_{ji}$ is the identity on $W_{ji}$, and
(iii) $W_{ki} \cap W_{ji} = \psi_{ij}(W_{kj} \cap W_{ij})$, and $\psi_{kj} \circ \psi_{ji} = \psi_{ki}$ on this set.

We form the disjoint union $X = \bigsqcup_i W_i$, i.e., the set of pairs $(x, i)$ with $x$ in $W_i$.
We topologize $X$ by requiring that each $W_i$ be open in $X$. Then we introduce a
relation $\sim$ on $X$ by saying that $(x, i) \sim (y, j)$ if $\psi_{ji}(x) = y$. The three properties
(i), (ii), and (iii) show that $\sim$ is an equivalence relation, and $X/\!\sim$ is called an
identification space. It is given the quotient topology.
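
The claim that (i), (ii), and (iii) make $\sim$ an equivalence relation can be unwound as follows. This is a sketch of the routine verification, written out only from the three stated properties.

```latex
% Reflexive: by (i), \psi_{ii}(x) = x, so (x,i) \sim (x,i).
% Symmetric: if \psi_{ji}(x) = y, then y \in W_{ij}, and by (ii)
\psi_{ij}(y) = \psi_{ij}(\psi_{ji}(x)) = x, \quad\text{so } (y,j) \sim (x,i).
% Transitive: if \psi_{ji}(x) = y and \psi_{kj}(y) = z, then y \in W_{kj} \cap W_{ij}, so by (ii) and (iii)
x = \psi_{ij}(y) \in \psi_{ij}(W_{kj} \cap W_{ij}) = W_{ki} \cap W_{ji},
\qquad \psi_{ki}(x) = \psi_{kj}(\psi_{ji}(x)) = \psi_{kj}(y) = z,
% so (x,i) \sim (z,k).
```

The set equality in (iii) is what guarantees that $x$ lies in the domain $W_{ki}$ of $\psi_{ki}$, which is needed before the composition formula can be applied.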

Let us see the effect of this construction in the special case that we reconstruct
a general smooth $n$-dimensional manifold out of an atlas of its charts. If $\kappa_i$ is a
chart in the atlas, we take $W_i$ to be the image $\widetilde M_{\kappa_i}$ of $\kappa_i$. With two such charts $\kappa_i$
and $\kappa_j$, define

\[
W_{ji} = \kappa_i(M_{\kappa_i} \cap M_{\kappa_j}), \qquad W_{ij} = \kappa_j(M_{\kappa_i} \cap M_{\kappa_j}), \qquad \psi_{ji} = \kappa_j \circ \kappa_i^{-1}.
\]

It is a routine matter to check (i), (ii), and (iii). The disjoint union $\bigsqcup_i \kappa_i^{-1}$ of
the maps $\kappa_i^{-1}$ is a continuous open function from $X = \bigsqcup_i W_i$ onto $M$. Let
$q : X \to X/\!\sim$ be the quotient map. If $(x, i) \sim (y, j)$, then $\psi_{ji}(x) = y$ and
hence $\kappa_j \circ \kappa_i^{-1}(x) = y$ and $\kappa_i^{-1}(x) = \kappa_j^{-1}(y)$. Thus equivalent points in $X$ map
to the same point in $M$, and we obtain a factorization $\bigsqcup_i \kappa_i^{-1} = \varphi \circ q$ for a
continuous open map $\varphi : X/\!\sim\; \to M$. Since the only identifications in $M$ are the
ones determined by the charts, i.e., the ones of the form $(x, i) \sim (y, j)$ as above,
$\varphi$ is one-one and consequently is a homeomorphism. We can recover the charts
of $M$ as well, since the restriction of $q$ to a single $W_i$ is one-one. The $i$th chart is
the function $(q|_{W_i})^{-1} \circ \varphi^{-1} : M_{\kappa_i} \to \widetilde M_{\kappa_i}$.

Thus an identification space is a suitable device for reconstructing a smooth 
manifold from its charts. We can therefore try to use identification spaces to 
build new smooth manifolds out of what ought to be their charts. Proposition 
8.11 below simplifies the checking of the Hausdorff condition. Proposition 8.12 
shows, under natural additional assumptions, that the identification space is a 
smooth manifold if it has been shown to be Hausdorff. 


Proposition 8.11. In the situation of an identification space formed from a
disjoint union $X = \bigsqcup_i W_i$ and an equivalence relation $\sim$, the quotient mapping
$q : X \to X/\!\sim$ is necessarily open. Consequently the identification space $X/\!\sim$
is Hausdorff if and only if the set of equivalent pairs in $X \times X$ is closed.


REMARKS. In applications we may expect that the given topological spaces
$W_i$ are Hausdorff, and then their disjoint union $X$ will be Hausdorff, and so will
$X \times X$. In this case the theory of nets becomes a handy tool for deciding whether
the set of equivalent pairs within $X \times X$ is closed. Thus suppose we have nets with
$x_\alpha \sim y_\alpha$ in $X$ and that $x_\alpha \to x_0$ and $y_\alpha \to y_0$. We are to prove that $x_0 \sim y_0$. Let
$x_0$ be in $W_i$, and let $y_0$ be in $W_j$. Since $W_i$ and $W_j$ are open in $X$, $x_\alpha$ is eventually
in $W_i$ and $y_\alpha$ is eventually in $W_j$. In other words, the Hausdorff condition depends
on only two sets $W_i$ at a time and is as follows: We may assume that $x_\alpha$ and $x_0$
are in $W_i$ with $x_\alpha \to x_0$, that $y_\alpha$ and $y_0$ are in $W_j$ with $y_\alpha \to y_0$, and that $x_\alpha \sim y_\alpha$
for all $\alpha$. What needs proof is that $x_0 \sim y_0$.
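
The net criterion in the remarks can fail, and a standard illustration is the line with two origins. The data below are an assumed example in the notation of this section, not a construction taken from the text.

```latex
% Assumed data: W_1 = W_2 = \mathbb{R}, \quad W_{21} = W_{12} = \mathbb{R} \setminus \{0\},
% \psi_{21} = \psi_{12} = \text{identity on } \mathbb{R} \setminus \{0\}, \quad \psi_{11}, \psi_{22} \text{ identities}.
x_k = (1/k,\, 1), \qquad y_k = (1/k,\, 2), \qquad x_k \sim y_k \ \text{ for all } k \ge 1,
x_k \to x_0 = (0,\, 1), \qquad y_k \to y_0 = (0,\, 2),
\text{but } x_0 \not\sim y_0 \ \text{ because } 0 \notin W_{21}.
```

Here the set of equivalent pairs is not closed in $X \times X$, so Proposition 8.11 shows that the identification space is not Hausdorff, even though each $W_i$ is.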


PROOF. The second statement follows from the first in view of Proposition
10.40 of Basic. Thus we have only to show that the quotient map is open.
If $U$ is open in $X$, we are to show that $q^{-1}(q(U))$ is open in $X$. The direct
image of a function respects arbitrary unions, and thus $q(U) = \bigcup_j q(U \cap W_j)$.
Hence $q^{-1}(q(U)) = \bigcup_j q^{-1}(q(U \cap W_j))$, and it is enough to prove that a single
$q^{-1}(q(U \cap W_j))$ is open. Since $X$ is the disjoint union of the open sets $W_i$, it
is enough to prove that each $W_i \cap q^{-1}(q(U \cap W_j))$ is open. This intersection
is the subset of elements in $W_i$ that get identified with elements in $U \cap W_j$,
namely $\psi_{ij}(U \cap W_{ij})$. Since $\psi_{ij}$ is a homeomorphism of $W_{ij}$ with $W_{ji}$, the set
$\psi_{ij}(U \cap W_{ij})$ is open in $W_{ji}$. Since $W_{ji}$ is open in $W_i$, $\psi_{ij}(U \cap W_{ij})$ is open in
$W_i$.
Proposition 8.12. Let the topological space $M$ be obtained as an identification
space from a disjoint union $X = \bigsqcup_i W_i$ in which each $W_i$ is an open subset of
$\mathbb{R}^n$. Suppose that each identification $\psi_{ji} : W_{ji} \to W_{ij}$ is a smooth function,
and suppose that $q : X \to M$ denotes the quotient mapping. Assume that the
set of equivalent pairs in $X \times X$ is a closed subset, so that $M$ is a Hausdorff
space. Then $M$ becomes a smooth $n$-dimensional manifold under the following
definition of an atlas of compatible charts: For each $i$, let $U_i = q(W_i)$, and define
$\kappa_i : U_i \to W_i$ to be the inverse of $q|_{W_i} : W_i \to U_i$. The charts of the atlas are
the maps $\kappa_i$.


PROOF. The mapping $q$ is open according to Proposition 8.11. Since $W_i$ is
open in $X$, $U_i = q(W_i)$ is open in $M$. To see that $q$ is one-one from $W_i$ to $U_i$,
suppose that two members of $W_i$ are equivalent. We know that the members of
$W_i$ are of the form $(w, i)$, and the equivalence relation is given by the statement

\[
(w_i, i) \sim (w_j, j) \quad \text{if and only if} \quad \psi_{ji}(w_i) = w_j. \tag{$*$}
\]

In particular $w_i$ must be in the domain of $\psi_{ji}$, which is $W_{ji}$. Then two members
of $W_i$, say $(w, i)$ and $(w', i)$, can be equivalent only if $\psi_{ii}(w) = w'$. Since
$\psi_{ii}$ is the identity function, $w = w'$. Therefore $q$ is one-one on $W_i$ and is a
homeomorphism of $W_i$ onto the open subset $U_i$ of $M$. Consequently $\kappa_i$ is well
defined as a homeomorphism of the open subset $U_i$ of $M$ with the open subset
$W_i$ of Euclidean space $\mathbb{R}^n$.

We have to check the compatibility of the charts. We have

\[
U_i \cap U_j = q(W_i) \cap q(W_j)
= \{\text{classes of } (w_i, i) \mid \psi_{ji} \text{ is defined on } w_i\} = q(W_{ji}).
\]

Then

\[
\kappa_i(U_i \cap U_j) = \kappa_i\big((q|_{W_i})(W_{ji})\big) = W_{ji},
\]

and similarly $\kappa_j(U_i \cap U_j) = W_{ij}$. Hence $\kappa_j \circ \kappa_i^{-1}$ carries $W_{ji}$ onto $W_{ij}$. If $(w_i, i)$
is a member of $W_{ji}$, we show that

\[
\kappa_j\big(\kappa_i^{-1}((w_i, i))\big) = (\psi_{ji}(w_i), j). \tag{$**$}
\]

If we drop the second entries of our pairs, which are present only to emphasize
that $X$ is a disjoint union, equation ($**$) says that $\kappa_j \circ \kappa_i^{-1}$ equals $\psi_{ji}$ on $W_{ji}$. Since
$\psi_{ji}$ is smooth by assumption, the verification of ($**$) will therefore complete the
proof of the proposition. Taking ($*$) into account, we have

\[
\kappa_i^{-1}((w_i, i)) = q((w_i, i)) = q((\psi_{ji}(w_i), j)) = \kappa_j^{-1}((\psi_{ji}(w_i), j)).
\]

Application of $\kappa_j$ to both sides of this identity yields ($**$) and thus completes the
proof.
4. Vector Bundles


In this section we introduce general vector bundles over a smooth manifold M. Of 
particular interest are the tangent and cotangent bundles. The tangent bundle as 
a set is to be identifiable with $\bigcup_{p \in M} T_p(M)$, and one realization of the cotangent
bundle as a set will be the same kind of union of the dual vector spaces $T_p^*(M)$
to $T_p(M)$. To construct these bundles as manifolds, we shall form them as
identification spaces in the sense of Section 3, and that step will be carried out in 
this section. 

We continue with the convention of writing F for the field of scalars, which is 
to be R or C. The fiber of any vector bundle will be F” for some n, and we speak 
of real and complex vector bundles in the two cases. 

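Let $M$ be a smooth manifold of dimension $m$, and let $\{\kappa\}$ be an atlas of
compatible charts, where $\kappa$ is the map $\kappa : M_\kappa \to \widetilde M_\kappa$. Denote by $GL(n,\mathbb{F})$
the general linear group of all $n$-by-$n$ nonsingular matrices with entries in $\mathbb{F}$. A
smooth coordinate vector bundle of rank $n$ over $M$ relative to this atlas consists
of a smooth manifold $B$ called the bundle space, a smooth mapping $\pi$ of $B$
onto $M$ called the projection from the bundle space to the base space $M$, and
diffeomorphisms $\phi_\kappa : M_\kappa \times \mathbb{F}^n \to \pi^{-1}(M_\kappa)$ called the coordinate functions
such that

(i) $\pi\phi_\kappa(p, v) = p$ for $p \in M_\kappa$ and $v \in \mathbb{F}^n$,
(ii) the smooth maps $\phi_{\kappa,p} : \mathbb{F}^n \to \pi^{-1}(p)$ defined for $p$ in $M_\kappa$ by $\phi_{\kappa,p}(v) = \phi_\kappa(p, v)$ are such that $\phi_{\kappa',p}^{-1} \circ \phi_{\kappa,p} : \mathbb{F}^n \to \mathbb{F}^n$ is in $GL(n,\mathbb{F})$ for each $\kappa$ and $\kappa'$ and for all $p$ in $M_\kappa \cap M_{\kappa'}$,
(iii) the map $g_{\kappa'\kappa} : M_\kappa \cap M_{\kappa'} \to GL(n,\mathbb{F})$ defined by $g_{\kappa'\kappa}(p) = \phi_{\kappa',p}^{-1} \circ \phi_{\kappa,p}$ is smooth.

The maps $p \mapsto g_{\kappa'\kappa}(p)$ will be called the transition functions⁵ of the coordinate
vector bundle.

An atlas of compatible charts of the coordinate vector bundle may be taken to
consist of the maps $(\kappa \times 1) \circ \phi_\kappa^{-1} : \pi^{-1}(M_\kappa) \to \widetilde M_\kappa \times \mathbb{F}^n$. The dimension of $B$
is $m + n$ if $\mathbb{F} = \mathbb{R}$ and is $m + 2n$ if $\mathbb{F} = \mathbb{C}$.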


EXAMPLE. Data for the tangent bundle. Although we have not yet introduced
the topology on the bundle space in this instance, we can identify the functions $\phi_\kappa$,
$\phi_{\kappa'}$, and $g_{\kappa'\kappa}$ explicitly. Let the local expressions for $\kappa$ and $\kappa'$ be $\kappa = (x_1,\dots,x_n)$
and $\kappa' = (y_1,\dots,y_n)$. Let $c = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}$ and $d = \begin{pmatrix} d_1 \\ \vdots \\ d_n \end{pmatrix}$ be members of $\mathbb{F}^n$. The
set $\pi^{-1}(M_\kappa)$ is to consist of all tangent vectors at points of $M_\kappa$, and Proposition


⁵The terms coordinate transformations and transition matrices are used also.
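8.5 shows that these are all expressions $\sum_{j=1}^n c_j \big[\frac{\partial}{\partial x_j}\big]_p$, where $\big[\frac{\partial f}{\partial x_j}\big]_p$ concretely
means $\frac{\partial (f \circ \kappa^{-1})}{\partial u_j}(\kappa(p))$. The formulas for $\phi_\kappa$ and $\phi_{\kappa'}$ are then

\[
\phi_{\kappa,p}(c) = \sum_{j=1}^n c_j \Big[\frac{\partial}{\partial x_j}\Big]_p
\qquad \text{and} \qquad
\phi_{\kappa',p}(d) = \sum_{j=1}^n d_j \Big[\frac{\partial}{\partial y_j}\Big]_p .
\]

The other relevant formula is the formula for the matrix of the differential of
a smooth mapping relative to compatible charts in the domain and range. The
formula is given in Proposition 8.6 and is

\[
dF_p\Big(\Big[\frac{\partial}{\partial x_j}\Big]_p\Big) = \sum_{i=1}^n \Big[\frac{\partial F_i}{\partial x_j}\Big]_p \Big[\frac{\partial}{\partial y_i}\Big]_{F(p)} .
\]

We apply this formula with $F$ equal to the identity mapping, whose local expression
is $\kappa' \circ \kappa^{-1}$ and therefore has $F_i = y_i \circ \kappa^{-1}$. Since the differential of the
identity is the identity, we have

\[
\Big[\frac{\partial}{\partial x_j}\Big]_p = \sum_{i=1}^n \Big[\frac{\partial y_i}{\partial x_j}\Big]_p \Big[\frac{\partial}{\partial y_i}\Big]_p .
\]

Substituting into the formula for $\phi_{\kappa,p}(c)$, we obtain

\[
\phi_{\kappa,p}(c) = \sum_{i=1}^n \Big( \sum_{j=1}^n c_j \Big[\frac{\partial y_i}{\partial x_j}\Big]_p \Big) \Big[\frac{\partial}{\partial y_i}\Big]_p .
\]

Therefore $\phi_{\kappa',p}^{-1}\phi_{\kappa,p}(c) = d$, where $d_i = \sum_{j=1}^n \big[\frac{\partial y_i}{\partial x_j}\big]_p c_j = \big(\big[\frac{\partial y_i}{\partial x_j}\big]_p c\big)_i$, and we
conclude that

\[
g_{\kappa'\kappa}(p) = \Big[\frac{\partial y_i}{\partial x_j}\Big]_p .
\]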

Returning to the case of a general coordinate vector bundle, let us observe a simple
property of the transition functions. 


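Proposition 8.13. Let $M$ be an $m$-dimensional smooth manifold, fix an
atlas $\{\kappa\}$ for $M$, and let $\pi : B \to M$ be a smooth vector bundle of rank $n$ with
transition functions $p \mapsto g_{\kappa'\kappa}(p)$. Then

\[
g_{\kappa''\kappa'}(p)\, g_{\kappa'\kappa}(p) = g_{\kappa''\kappa}(p) \qquad \text{for all } p \in M_\kappa \cap M_{\kappa'} \cap M_{\kappa''}.
\]

Consequently the transition functions satisfy the identities $g_{\kappa\kappa}(p) = 1$ for $p \in M_\kappa$
and $g_{\kappa\kappa'}(p) = (g_{\kappa'\kappa}(p))^{-1}$ for $p \in M_\kappa \cap M_{\kappa'}$.

PROOF. We have $g_{\kappa''\kappa'}(p)\, g_{\kappa'\kappa}(p) = \phi_{\kappa'',p}^{-1}\phi_{\kappa',p}\phi_{\kappa',p}^{-1}\phi_{\kappa,p} = \phi_{\kappa'',p}^{-1}\phi_{\kappa,p} =
g_{\kappa''\kappa}(p)$. Putting $\kappa = \kappa' = \kappa''$ yields $g_{\kappa\kappa}(p)\, g_{\kappa\kappa}(p) = g_{\kappa\kappa}(p)$; thus $g_{\kappa\kappa}(p) = 1$.
Putting $\kappa = \kappa''$ yields $g_{\kappa\kappa'}(p)\, g_{\kappa'\kappa}(p) = g_{\kappa\kappa}(p) = 1$.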
The main abstract result about vector bundles for our purposes will be a
converse to Proposition 8.13, enabling us to construct a vector bundle from an
atlas of $M$ and a system of smooth functions $p \mapsto g_{\kappa'\kappa}(p)$ defined on $M_\kappa \cap M_{\kappa'}$ if
these functions satisfy the conditions of the proposition. This result will be given
as Proposition 8.14 below. In the case of the tangent bundle, we saw above that
$g_{\kappa'\kappa}(p)$ is given by $g_{\kappa'\kappa}(p) = \big[\frac{\partial y_i}{\partial x_j}\big]_p$. The identity $g_{\kappa''\kappa'}(p)\, g_{\kappa'\kappa}(p) = g_{\kappa''\kappa}(p)$
follows from the chain rule, and thus the abstract result will complete the con- 
struction of the tangent bundle as a smooth manifold. We shall construct the 
cotangent bundle similarly. 
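
The chain-rule identity for the tangent-bundle transition functions can be checked numerically in a small case. The sketch below uses three hypothetical charts on the open right half-plane of $\mathbb{R}^2$ (Cartesian, polar, and a third chart $(\log r, \theta + r)$ invented for the example) and approximates each Jacobian matrix by central differences. All function names and the sample point are illustrative assumptions.

```python
import math

def jacobian(change, point, h=1e-6):
    """Central-difference approximation to the matrix [d(change_i)/d(u_j)] at `point`."""
    n = len(point)
    columns = []
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = change(plus), change(minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    # columns[j][i] holds d(change_i)/d(u_j); reorder into rows indexed by i.
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical charts on {x > 0}: kappa = (x, y), kappa' = (r, theta),
# kappa'' = (log r, theta + r).
def k1_from_k0(u):   # kappa' o kappa^{-1}
    return [math.hypot(u[0], u[1]), math.atan2(u[1], u[0])]

def k0_from_k1(w):   # kappa o kappa'^{-1}
    return [w[0] * math.cos(w[1]), w[0] * math.sin(w[1])]

def k2_from_k1(w):   # kappa'' o kappa'^{-1}
    return [math.log(w[0]), w[1] + w[0]]

def k2_from_k0(u):   # kappa'' o kappa^{-1}
    return k2_from_k1(k1_from_k0(u))

x = [1.2, 0.7]       # kappa(p) for an arbitrary sample point p
w = k1_from_k0(x)    # kappa'(p)

g_10 = jacobian(k1_from_k0, x)   # g_{kappa' kappa}(p)
g_21 = jacobian(k2_from_k1, w)   # g_{kappa'' kappa'}(p)
g_20 = jacobian(k2_from_k0, x)   # g_{kappa'' kappa}(p)
g_01 = jacobian(k0_from_k1, w)   # g_{kappa kappa'}(p)

cocycle_defect = max_abs_diff(matmul(g_21, g_10), g_20)
inverse_defect = max_abs_diff(matmul(g_01, g_10), [[1.0, 0.0], [0.0, 1.0]])
```

Both defects are small up to finite-difference error, which reflects the product identity and the inverse identity of Proposition 8.13 for this particular choice of charts.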

One can equally construct other vector bundles of interest in analysis, as we 
shall see, but we shall omit the details for most of these. It is fairly clear from 
the example above that one can make local calculations with vector bundles by 
working with the transition functions. Here is an example. 


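EXAMPLE. Suppose for a particular coordinate vector bundle that we have a
system of functions $f_\kappa : \widetilde M_\kappa \times \mathbb{F}^n \to S$ with range equal to some set $S$ independent
of $\kappa$. Let us determine the circumstances under which the system $\{f_\kappa\}$ is the local
form of some globally defined function $f : B \to S$. A necessary and sufficient
condition is that whenever $(x, v) \in \widetilde M_\kappa \times \mathbb{F}^n$ and $(y, v') \in \widetilde M_{\kappa'} \times \mathbb{F}^n$ correspond
to the same point of $B$, then $f_\kappa(x, v) = f_{\kappa'}(y, v')$. The maps from $\widetilde M_\kappa \times \mathbb{F}^n$ and
$\widetilde M_{\kappa'} \times \mathbb{F}^n$ into $B$ are $\phi_\kappa \circ (\kappa^{-1} \times 1)$ and $\phi_{\kappa'} \circ (\kappa'^{-1} \times 1)$. Thus $(x, v)$ and $(y, v')$
correspond to the same member of $B$ if and only if $\phi_\kappa(\kappa^{-1}x, v) = \phi_{\kappa'}(\kappa'^{-1}y, v')$.
We must have $\kappa^{-1}x = \kappa'^{-1}y$ for this equality. In this case let us put $p = \kappa^{-1}x =
\kappa'^{-1}y$, and then it is necessary and sufficient that $\phi_{\kappa,p}(v) = \phi_{\kappa',p}(v')$, hence
that $\phi_{\kappa',p}^{-1} \circ \phi_{\kappa,p}(v) = v'$, hence that $g_{\kappa'\kappa}(p)(v) = v'$. Thus $(x, v)$ and $(y, v')$
correspond to the same point in $B$ if and only if $y = \kappa'\kappa^{-1}x$ and $g_{\kappa'\kappa}(\kappa^{-1}x)(v) =
v'$. Consequently the functions $f_\kappa$ define a global $f$ if and only if

\[
\boxed{\, f_\kappa(x, v) = f_{\kappa'}\big(\kappa'\kappa^{-1}x,\; g_{\kappa'\kappa}(\kappa^{-1}x)(v)\big) \,}
\]

whenever $\kappa'\kappa^{-1}x$ is defined. In the case of the tangent bundle, we saw in the
previous example that $g_{\kappa'\kappa} = \big[\frac{\partial y_i}{\partial x_j}\big]$. Thus the condition is that

\[
f_\kappa(x, v) = f_{\kappa'}\Big(y,\; \Big[\frac{\partial y_i}{\partial x_j}\Big](v)\Big)
\]

whenever $y = \kappa'\kappa^{-1}(x)$; here the fiber dimension $n$ is also the dimension of the
base manifold $M$.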
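
A one-dimensional instance makes the compatibility condition for the tangent bundle concrete. The manifold $M = \mathbb{R}$, the two global charts, and the functions below are assumptions chosen for this sketch.

```latex
% Assumed example: M = \mathbb{R}, \kappa = x, \kappa' = y = 2x, so g_{\kappa'\kappa} = dy/dx = 2.
% The condition f_\kappa(x,v) = f_{\kappa'}(y, (dy/dx)\,v) becomes
f_\kappa(x, v) = f_{\kappa'}(2x,\, 2v).
% The pair
f_\kappa(x, v) = v^2, \qquad f_{\kappa'}(y, v') = \tfrac14\, v'^2
% satisfies it, since f_{\kappa'}(2x, 2v) = \tfrac14 (2v)^2 = v^2,
% whereas f_\kappa(x,v) = v^2 together with f_{\kappa'}(y,v') = v'^2 does not.
```

The factor $\tfrac14$ is exactly what compensates for the way the fiber coordinate rescales under the change of chart.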


Before getting to the converse result to Proposition 8.13, let us address the 
question of when, for given $n$, $\mathbb{F}$, $M$, $B$, and $\pi$, we get the “same” coordinate
vector bundle from a different but compatible atlas $\{\lambda\}$ and different coordinate
functions $\phi_\lambda$. The condition that we impose, which is called strict equivalence,
is that if we set up the transition functions corresponding to a member $\kappa$ of the
first atlas and a member $\lambda$ of the second atlas, namely

\[
\bar g_{\lambda\kappa}(p) = \phi_{\lambda,p}^{-1}\,\phi_{\kappa,p} \qquad \text{for } p \in M_\kappa \cap M_\lambda,
\]

then each $\bar g_{\lambda\kappa}(p)$ lies in $GL(n,\mathbb{F})$ and the function $p \mapsto \bar g_{\lambda\kappa}(p)$ is smooth from
$M_\kappa \cap M_\lambda$ into $GL(n,\mathbb{F})$. In other words, strict equivalence means that the union
of the two atlases, along with the associated data, is to make $\pi : B \to M$ into a
coordinate vector bundle. Strict equivalence is certainly reflexive and symmetric.
Since we can discard some charts from the construction of a coordinate vector 
bundle as long as the remaining charts cover M, strict equivalence is transitive. 
An equivalence class of strictly equivalent coordinate vector bundles is called a 
vector bundle, real or complex according as F is R or C. 

With the definition of smooth structure for a smooth manifold, we were able 
to make the atlas canonical by assuming that it was maximal. Every atlas of 
compatible charts could be extended to one and only one maximal such atlas, 
and therefore smooth manifolds were determined by specifying any atlas of 
compatible charts, not necessarily a maximal one. We do not have to address 
the corresponding question about vector bundles—whether the atlas of M used 
in defining a coordinate vector bundle can be enlarged to a maximal atlas of M 
and still define a coordinate vector bundle. The reason is that the specific vector 
bundles with which we work are all definable immediately by maximal atlases of 
M. 

Now let us proceed with the converse result. 


Proposition 8.14. If a smooth $m$-dimensional manifold $M$ is given, along
with an atlas $\{\kappa\}$ of compatible charts and a system of smooth functions
$g_{\kappa'\kappa} : M_\kappa \cap M_{\kappa'} \to GL(n,\mathbb{F})$ satisfying the property $g_{\kappa''\kappa'}(p)\, g_{\kappa'\kappa}(p) = g_{\kappa''\kappa}(p)$
for all $p$ in $M_\kappa \cap M_{\kappa'} \cap M_{\kappa''}$, then there exists a coordinate vector bundle
$\pi : B \to M$ with the functions $g_{\kappa'\kappa}$ as transition functions. The result is
unique in the following sense: if $\pi : B \to M$ and $\pi' : B' \to M$ are two
such coordinate vector bundles, with coordinate functions $\phi_\kappa$ and $\phi'_\kappa$, then there
exists a diffeomorphism $h : B \to B'$ such that $\pi' \circ h = \pi$ and $\phi'_\kappa = h \circ \phi_\kappa$ for
all charts $\kappa$ in the atlas.


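PROOF OF UNIQUENESS OF COORDINATE VECTOR BUNDLE UP TO FUNCTION $h$.
Define a diffeomorphism $h_\kappa : \pi^{-1}(M_\kappa) \to \pi'^{-1}(M_\kappa)$ by $h_\kappa = \phi'_\kappa \circ \phi_\kappa^{-1}$, so
that $h_\kappa \circ \phi_\kappa = \phi'_\kappa$. Evaluating both sides at $(p, \mathbb{F}^n)$ with $p$ in $M_\kappa$, we obtain
$h_\kappa(\pi^{-1}(p)) = \pi'^{-1}(p)$. Thus $\pi' \circ h_\kappa = \pi$ on $\pi^{-1}(M_\kappa)$.

Since the map $h_{\kappa,p} = h_\kappa\big|_{\pi^{-1}(p)}$ carries $\pi^{-1}(p)$ to $\pi'^{-1}(p)$, we can write
$h_{\kappa,p} \circ \phi_{\kappa,p} = \phi'_{\kappa,p}$. If $p$ is also in $M_{\kappa'}$, then we have $h_{\kappa',p} \circ \phi_{\kappa',p} = \phi'_{\kappa',p}$
as well. Since $B$ and $B'$ are assumed to have the same transition functions,
$g_{\kappa'\kappa}(p) = \phi_{\kappa',p}^{-1}\phi_{\kappa,p} = \phi'^{-1}_{\kappa',p}\phi'_{\kappa,p}$; in other words, $\phi_{\kappa',p}\, g_{\kappa'\kappa}(p) = \phi_{\kappa,p}$ and
$\phi'_{\kappa',p}\, g_{\kappa'\kappa}(p) = \phi'_{\kappa,p}$. Therefore

\[
h_{\kappa,p}\,\phi_{\kappa,p} = \phi'_{\kappa,p} = \phi'_{\kappa',p}\, g_{\kappa'\kappa}(p) = h_{\kappa',p}\,\phi_{\kappa',p}\, g_{\kappa'\kappa}(p) = h_{\kappa',p}\,\phi_{\kappa,p},
\]

and we obtain $h_{\kappa,p} = h_{\kappa',p}$. Thus the functions $h_\kappa$ are consistently defined on
their common domains and fit together as a global diffeomorphism of $B$ onto $B'$.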


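PROOF OF EXISTENCE OF COORDINATE VECTOR BUNDLE. Let us construct
$B$ as an identification space. We are writing $\widetilde M_\kappa$ for $\kappa(M_\kappa)$, and we put
$\widetilde M_{\kappa'\kappa} = \kappa(M_\kappa \cap M_{\kappa'})$. Define $W_\kappa = \widetilde M_\kappa \times \mathbb{F}^n$ and $W_{\kappa'\kappa} = \widetilde M_{\kappa'\kappa} \times \mathbb{F}^n$, and
let

\[
\psi_{\kappa'\kappa}(\tilde m, v) = \big(\kappa'\kappa^{-1}(\tilde m),\; g_{\kappa'\kappa}(\kappa^{-1}(\tilde m))(v)\big) \qquad \text{for } (\tilde m, v) \in W_{\kappa'\kappa}.
\]

We shall prove that $X = \bigsqcup_\kappa W_\kappa$, together with the functions $\psi_{\kappa'\kappa}$, defines an
identification space $B = X/\!\sim$. We have to check (i), (ii), and (iii) in Section 3.
For (i), we need that $\psi_{\kappa\kappa}$ is the identity on $W_{\kappa\kappa} = W_\kappa$, and the computation is

\[
\psi_{\kappa\kappa}(\tilde m, v) = \big(\tilde m,\; g_{\kappa\kappa}(\kappa^{-1}(\tilde m))(v)\big) = (\tilde m, v)
\]

since $g_{\kappa\kappa}(\,\cdot\,)$ is identically the identity matrix. For (ii), we need that $\psi_{\kappa\kappa'}\psi_{\kappa'\kappa}$ is
the identity on $W_{\kappa'\kappa}$. The composition on $(\tilde m, v)$ is

\[
\begin{aligned}
\psi_{\kappa\kappa'}\big(\kappa'\kappa^{-1}(\tilde m),\; g_{\kappa'\kappa}(\kappa^{-1}(\tilde m))(v)\big)
&= \big(\kappa\kappa'^{-1}\kappa'\kappa^{-1}(\tilde m),\; g_{\kappa\kappa'}(\kappa'^{-1}(\kappa'\kappa^{-1}(\tilde m)))\, g_{\kappa'\kappa}(\kappa^{-1}(\tilde m))(v)\big) \\
&= \big(\tilde m,\; g_{\kappa\kappa'}(\kappa^{-1}(\tilde m))\, g_{\kappa'\kappa}(\kappa^{-1}(\tilde m))(v)\big).
\end{aligned}
\]

The second member of the right side collapses to $v$ since $g_{\kappa\kappa'}(p)\, g_{\kappa'\kappa}(p) = 1$ for
all $p$ in $M_\kappa \cap M_{\kappa'}$. This proves (ii). For (iii), we need that $\psi_{\kappa''\kappa'} \circ \psi_{\kappa'\kappa} = \psi_{\kappa''\kappa}$ on the
set $W_{\kappa''\kappa} \cap W_{\kappa'\kappa} = \psi_{\kappa\kappa'}(W_{\kappa''\kappa'} \cap W_{\kappa\kappa'})$, and the composition on $(\tilde m, v)$ is

\[
\begin{aligned}
\psi_{\kappa''\kappa'}\psi_{\kappa'\kappa}(\tilde m, v)
&= \psi_{\kappa''\kappa'}\big(\kappa'\kappa^{-1}(\tilde m),\; g_{\kappa'\kappa}(\kappa^{-1}(\tilde m))(v)\big) \\
&= \big(\kappa''\kappa'^{-1}\kappa'\kappa^{-1}(\tilde m),\; g_{\kappa''\kappa'}(\kappa'^{-1}\kappa'\kappa^{-1}(\tilde m))\, g_{\kappa'\kappa}(\kappa^{-1}(\tilde m))(v)\big) \\
&= \big(\kappa''\kappa^{-1}(\tilde m),\; g_{\kappa''\kappa'}(\kappa^{-1}(\tilde m))\, g_{\kappa'\kappa}(\kappa^{-1}(\tilde m))(v)\big) \\
&= \big(\kappa''\kappa^{-1}(\tilde m),\; g_{\kappa''\kappa}(\kappa^{-1}(\tilde m))(v)\big) \\
&= \psi_{\kappa''\kappa}(\tilde m, v).
\end{aligned}
\]

This proves (iii) and completes the construction of $B$.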
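To prove that $B$ is Hausdorff, we apply Proposition 8.11 and its remark. Thus
suppose that we have nets with $x_\alpha \sim y_\alpha$ in $X$, that $x_\alpha \to x_0$ and $y_\alpha \to y_0$, and
that $x_\alpha$ and $x_0$ are in $W_\kappa$ and $y_\alpha$ and $y_0$ are in $W_{\kappa'}$. We are to prove that $x_0 \sim y_0$.
Write $x_\alpha = (\tilde m_\alpha, v_\alpha)$, $x_0 = (\tilde m_0, v_0)$, $y_\alpha = (\tilde m'_\alpha, v'_\alpha)$, and $y_0 = (\tilde m'_0, v'_0)$. The
assumed convergence says that $\tilde m_\alpha \to \tilde m_0$, $v_\alpha \to v_0$, $\tilde m'_\alpha \to \tilde m'_0$, and $v'_\alpha \to v'_0$.
The assumed equivalence $x_\alpha \sim y_\alpha$ says that $\psi_{\kappa'\kappa}(\tilde m_\alpha, v_\alpha) = (\tilde m'_\alpha, v'_\alpha)$, i.e.,

\[
\kappa'\kappa^{-1}(\tilde m_\alpha) = \tilde m'_\alpha \qquad \text{and} \qquad g_{\kappa'\kappa}(\kappa^{-1}(\tilde m_\alpha))(v_\alpha) = v'_\alpha,
\]

and we are to prove that

\[
\kappa'\kappa^{-1}(\tilde m_0) = \tilde m'_0 \qquad \text{and} \qquad g_{\kappa'\kappa}(\kappa^{-1}(\tilde m_0))(v_0) = v'_0.
\]

The functions $\kappa'\kappa^{-1}$, $g_{\kappa'\kappa}$, and $\kappa^{-1}$ are continuous, and the only question is
whether $\tilde m_0$ is in the domain of $\kappa'\kappa^{-1}$ and $\kappa^{-1}(\tilde m_0)$ is in the domain of $g_{\kappa'\kappa}$, i.e.,
whether $\tilde m_0$ is in the subset $\widetilde M_{\kappa'\kappa} = \kappa(M_\kappa \cap M_{\kappa'})$ of $\widetilde M_\kappa = \kappa(M_\kappa)$. Assume the
contrary. Then $\tilde m_0$ is on the boundary of $\kappa(M_\kappa \cap M_{\kappa'})$ in $\kappa(M_\kappa)$ and $\tilde m'_0$ is on
the boundary of $\kappa'(M_\kappa \cap M_{\kappa'})$ in $\kappa'(M_{\kappa'})$. So $\kappa^{-1}(\tilde m_0)$ is on the boundary of
$M_\kappa \cap M_{\kappa'}$ in $M_\kappa$, and $\kappa'^{-1}(\tilde m'_0)$ is on the boundary of $M_\kappa \cap M_{\kappa'}$ in $M_{\kappa'}$. This
implies that $\kappa^{-1}(\tilde m_0)$ is in $M_\kappa$ but not $M_{\kappa'}$ while $\kappa'^{-1}(\tilde m'_0)$ is in $M_{\kappa'}$ but not $M_\kappa$.
Consequently $\kappa^{-1}(\tilde m_0) \ne \kappa'^{-1}(\tilde m'_0)$. Since $M$ is Hausdorff, we can find disjoint
open neighborhoods $V$ and $V'$ of $\kappa^{-1}(\tilde m_0)$ and $\kappa'^{-1}(\tilde m'_0)$ in $M$. Since $\kappa^{-1}$ is
continuous, $\kappa^{-1}(\tilde m_\alpha)$ is eventually in $V$; since $\kappa'^{-1}$ is continuous, $\kappa'^{-1}(\tilde m'_\alpha)$ is
eventually in $V'$. Then we cannot have $\kappa^{-1}(\tilde m_\alpha) = \kappa'^{-1}(\tilde m'_\alpha)$ eventually, hence
cannot have $\kappa'\kappa^{-1}(\tilde m_\alpha) = \tilde m'_\alpha$ eventually, contradiction. We conclude that $B$ is
Hausdorff.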

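To complete the proof, we exhibit $B$ as a coordinate vector bundle. Let
$q : X \to B$ be the quotient map. Application of Proposition 8.12 produces a
manifold structure on $B$, the charts being of the form $\kappa^{*} = (q|_{W_\kappa})^{-1}$ with domain
$q(W_\kappa)$. If $p_\kappa$ denotes the projection of $W_\kappa$ on $\widetilde M_\kappa$, we define $\pi : q(W_\kappa) \to M$ to
be the composition $\kappa^{-1} p_\kappa \kappa^{*}$. To have $\pi : B \to M$ be globally defined, we have
to check consistency from chart to chart. Thus suppose that $b = q(w_\kappa) = q(w_{\kappa'})$
with $w_\kappa = (\tilde m_\kappa, v_\kappa)$ in $W_\kappa$ and $w_{\kappa'} = (\tilde m_{\kappa'}, v_{\kappa'})$ in $W_{\kappa'}$. We are to check that
$\kappa^{-1} p_\kappa(w_\kappa) = \kappa'^{-1} p_{\kappa'}(w_{\kappa'})$, hence that $\kappa^{-1}(\tilde m_\kappa) = \kappa'^{-1}(\tilde m_{\kappa'})$. The condition
$q(w_\kappa) = q(w_{\kappa'})$ means that $w_\kappa \sim w_{\kappa'}$, which means that $\psi_{\kappa'\kappa}(w_\kappa) = w_{\kappa'}$, and
therefore that $\big(\kappa'\kappa^{-1}(\tilde m_\kappa),\; g_{\kappa'\kappa}(\kappa^{-1}(\tilde m_\kappa))(v_\kappa)\big) = (\tilde m_{\kappa'}, v_{\kappa'})$. Examining the first
entries shows that $\kappa^{-1}(\tilde m_\kappa) = \kappa'^{-1}(\tilde m_{\kappa'})$. Therefore $\pi$ is well defined.

The diffeomorphism $\phi_\kappa : M_\kappa \times \mathbb{F}^n \to \pi^{-1}(M_\kappa)$ is given by $\phi_\kappa = q \circ (\kappa \times 1)$.
If $p$ is in $M_\kappa \cap M_{\kappa'}$, write $v' = \phi_{\kappa',p}^{-1}(\phi_{\kappa,p}(v))$. Then $\phi_{\kappa',p}(v') = \phi_{\kappa,p}(v)$, and
hence $q(\kappa'(p), v') = q(\kappa(p), v)$. Thus $(\kappa'(p), v') \sim (\kappa(p), v)$, and

\[
(\kappa'(p), v') = \psi_{\kappa'\kappa}(\kappa(p), v) = \big(\kappa'\kappa^{-1}(\kappa(p)),\; g_{\kappa'\kappa}(\kappa^{-1}(\kappa(p)))(v)\big).
\]

Examining the equality of the second coordinates, we see that $v' = g_{\kappa'\kappa}(p)(v)$.
Therefore $\phi_{\kappa',p}^{-1} \circ \phi_{\kappa,p} = g_{\kappa'\kappa}(p)$, and the transition functions match the given
functions. This completes the proof.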
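
A small example shows how Proposition 8.14 produces a bundle from transition data alone. The circle, the two angular charts, and the choice of signs below are assumptions made for this sketch; the resulting rank-one real bundle is the familiar Möbius line bundle over the circle.

```latex
% Assumed data: M = S^1 = \{e^{i\theta}\}, \ n = 1, \ \mathbb{F} = \mathbb{R}.
M_\kappa = S^1 \setminus \{1\}, \quad \kappa(e^{i\theta}) = \theta \in (0, 2\pi), \qquad
M_{\kappa'} = S^1 \setminus \{-1\}, \quad \kappa'(e^{i\theta}) = \theta \in (-\pi, \pi).
% M_\kappa \cap M_{\kappa'} has two components, \operatorname{Im} z > 0 and \operatorname{Im} z < 0.
g_{\kappa'\kappa}(z) = g_{\kappa\kappa'}(z) =
\begin{cases} +1, & \operatorname{Im} z > 0, \\ -1, & \operatorname{Im} z < 0, \end{cases}
\qquad g_{\kappa\kappa} = g_{\kappa'\kappa'} = 1 .
% Every triple of charts repeats a chart, so the product identity reduces to
g_{\kappa\kappa'}\, g_{\kappa'\kappa} = (\pm 1)^2 = 1 = g_{\kappa\kappa},
\qquad g_{\kappa'\kappa}\, g_{\kappa\kappa} = g_{\kappa'\kappa}, \qquad g_{\kappa'\kappa'}\, g_{\kappa'\kappa} = g_{\kappa'\kappa}.
```

The functions are locally constant and hence smooth, so the hypotheses of the proposition hold. Replacing $-1$ by $+1$ on the second component would instead give the product bundle $S^1 \times \mathbb{R}$.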


As we mentioned after Proposition 8.13, Proposition 8.14 enables us to introduce
the structure of a vector bundle on the tangent bundle $T(M)$, since the
product formula for the transition functions $g_{\kappa'\kappa}(p) = \big[\frac{\partial y_i}{\partial x_j}\big]_p$ follows from the
chain rule. The transition functions $g_{\kappa'\kappa}(p) = \big[\frac{\partial y_i}{\partial x_j}\big]_p$ are real-valued and thus can
be regarded as in $GL(n,\mathbb{R})$ or $GL(n,\mathbb{C})$. Thus $T(M)$, in our construction, can be
regarded as having fiber $\mathbb{R}^n$ or $\mathbb{C}^n$, whichever is more convenient in a particular
context. We can speak of the real tangent bundle $T(M,\mathbb{R})$ and the complex
tangent bundle $T(M,\mathbb{C})$ in the two cases.⁶

We shall make use also of the cotangent bundle $T^*(M)$, and again we shall
allow this to be real or complex. Members of the cotangent bundle will be called 
cotangent vectors. We give two slightly different realizations of T*(M), one 
starting from T(M) as the object of primary interest and the other proceeding 
directly to T*(M). In both cases, T*(M) and T (M) will be fiber-by-fiber duals of 
one another, and the transition functions will be transpose inverses of one another. 

For the first construction we shall identify the dual of $T_p(M)$ in terms of
differentials as defined in Section 1. Let $M$ be $n$-dimensional, let $\kappa$ be a compatible
chart about $p$, and let $f \in C^\infty(U)$ be a smooth function in a neighborhood of
$p$. By definition from Section 1, the differential $(df)_p$ is the linear function
$(df)_p : T_p(M) \to T_{f(p)}(\mathbb{F})$ given by

\[
(df)_p(L)(g) = L(g \circ f).
\]

Let us take $g_0 : \mathbb{F} \to \mathbb{F}$ to be the function $g_0(t) = t$. Since

\[
(df)_p\Big(\Big[\frac{\partial}{\partial x_j}\Big]_p\Big)(g_0) = \frac{\partial (g_0 \circ f)}{\partial x_j}(p) = g_0'(f(p))\,\frac{\partial f}{\partial x_j}(p) = \frac{\partial f}{\partial x_j}(p),
\]

we see that $(df)_p(L)(g_0) = Lf$ for all $L$ in $T_p(M)$. In particular, each differential
$(df)_p$ acts as a linear functional on $T_p(M)$. Moreover, the elements $(dx_i)_p$,
namely the differentials for $f = x_i$, are the members of the dual basis to the basis
$\big[\frac{\partial}{\partial x_j}\big]_p$ of $T_p(M)$, and we can use them to write

\[
(df)_p = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(p)\,(dx_i)_p .
\]


We postpone a discussion of the bundle structure on T*(M) until after the second 
construction. 


⁶Traditionally the words “tangent bundle” refer to what is being called the real tangent bundle,
and the traditional notation for it is T(M). 
For the second construction we use the algebra $C_p$ of germs at $p$. Evaluation
at $p$ is well defined on germs at $p$, and we let $C_p^0$ be the vector subspace of germs
whose value at $p$ is 0. Inside $C_p^0$, we wish to identify the vector subspace $C_p^1$ of
germs that vanish at least to second order at $p$. These are⁷ germs of functions $f$
with the property that $|f(q) - f(p)|$ is dominated by a multiple of $|\kappa(q) - \kappa(p)|^2$
in any chart $\kappa$ about $p$ when $q$ is in a sufficiently small neighborhood of $p$.

Within the second construction the cotangent space $T_p^*(M)$ is defined as the
vector space quotient $C_p^0 / C_p^1$. To introduce a vector-bundle structure on $T^*(M) =
\bigcup_p T_p^*(M)$ by means of Proposition 8.14, we need to set up the local expression
for a member of the cotangent space and understand how it changes when we
pass from one compatible chart $\kappa$ to another $\kappa'$. We begin by observing for any
open neighborhood $U$ of $p$ that there is a well-defined linear map $f \mapsto df(p)$ of
$C^\infty(U)$ onto $T_p^*(M)$ given by passing from $f$ to $f - f(p)$ in $C_p^0$ and then to the
coset of $f - f(p)$ in $T_p^*(M) = C_p^0 / C_p^1$.


Proposition 8.15. Let $M$ be a smooth manifold of dimension $n$, let $p$ be in $M$,
and let $\kappa = (x_1,\dots,x_n)$ be a compatible chart about $p$. In either construction
of $T_p^*(M)$, the $n$ quantities $dx_i(p)$ form a vector-space basis of $T_p^*(M)$, and any
smooth function $f$ defined in a neighborhood of $p$ has

\[
df(p) = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(p)\, dx_i(p).
\]


PROOF. We have already obtained this formula for the first construction. For
the second construction, we observe as in the proof of Proposition 8.5 that Taylor’s
Theorem yields an expansion for $f$ in the chart $\kappa$ about $p$ as

\[
f(q) = f(p) + \sum_{i=1}^n (x_i(q) - x_i(p))\,\frac{\partial f}{\partial x_i}(p)
+ \sum_{i,j} (x_i(q) - x_i(p))(x_j(q) - x_j(p))\, r_{ij}(q),
\]

from which we obtain

\[
df(p) = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(p)\, dx_i(p).
\]

This establishes the asserted expansion and shows that the $dx_i(p)$ span the vector
space $T_p^*(M)$. For the linear independence suppose that $\sum_{i=1}^n c_i\, dx_i(p) = 0$ with


⁷If we allow ourselves to peek momentarily at the tangent space, we see that $C_p^1$ is the subspace
of all members of $C_p^0$ on which all tangent vectors at $p$ vanish.
the constants $c_i$ not all 0. If we define $f = \sum_{i=1}^n c_i x_i$ in $M_\kappa$, then computation
gives $\frac{\partial f}{\partial x_i}(p) = c_i$ and hence $df(p) = \sum_{i=1}^n c_i\, dx_i(p) = 0$. Thus $f - f(p)$
vanishes at least to order 2 at $p$. Since $f - f(p)$ is linear, we conclude that
$f - f(p)$ vanishes identically near $p$. Hence all coefficients $c_i$ are 0. This proves
the linear independence.


When $p$ moves within the compatible chart $\kappa$, we can express all members of
the spaces $T_q^*(M)$ for $q$ in that neighborhood as $\sum_{i=1}^n \xi_i(q)\, dx_i(q)$, but the functions
$\xi_i(q)$ need not always be of the form $\frac{\partial f}{\partial x_i}(q)$ for a single function $f$. Nevertheless,
we can use the transformation properties of df (p) for special f’s to introduce a 
natural vector-bundle structure on T*(M) by means of Proposition 8.14. 


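EXAMPLE. Direct construction of bundle structure on cotangent bundle. Continuing
with the direct analysis of $T^*(M)$, let us form the coordinate functions and
charts. Define $T^*(M_\kappa) = \bigcup_{p\in M_\kappa} T_p^*(M)$. Using Proposition 8.15, we associate
to a member $(p,\xi)$ of $T^*(M_\kappa)$ the coordinates

$$(x_1(p),\dots,x_n(p);\ \xi_1,\dots,\xi_n),$$

where $\kappa(p) = (x_1(p),\dots,x_n(p))$ and $\xi = \sum_{i=1}^n \xi_i\,dx_i(p)$. The coordinate function
$\phi_\kappa$ is given in this notation as a composition carrying $(p;\xi_1,\dots,\xi_n)$ first
to $(x_1(p),\dots,x_n(p);\xi_1,\dots,\xi_n)$ and then to $\sum_{i=1}^n \xi_i\,dx_i(p)$. That is,

$$\phi_\kappa(p;\xi_1,\dots,\xi_n) = \sum_{i=1}^n \xi_i\,dx_i(p).$$

If $p$ lies in another chart $\kappa' = (y_1,\dots,y_n)$, then we similarly have

$$\phi_{\kappa'}(p;\eta_1,\dots,\eta_n) = \sum_{j=1}^n \eta_j\,dy_j(p).$$

The formula of Proposition 8.15 shows that

$$dx_i(p) = \sum_{j=1}^n \frac{\partial x_i}{\partial y_j}(p)\,dy_j(p).$$

Therefore

$$\phi_\kappa(p;\xi_1,\dots,\xi_n) = \sum_{i=1}^n \xi_i\,dx_i(p) = \sum_{j=1}^n \Big(\sum_{i=1}^n \xi_i\,\frac{\partial x_i}{\partial y_j}(p)\Big)\,dy_j(p)$$

and

$$\phi_{\kappa'}^{-1}\phi_\kappa(p;\xi_1,\dots,\xi_n) = \Big(p;\ \sum_{i=1}^n \xi_i\,\frac{\partial x_i}{\partial y_1}(p),\ \dots,\ \sum_{i=1}^n \xi_i\,\frac{\partial x_i}{\partial y_n}(p)\Big).$$

In other words,

$$\phi_{\kappa'}^{-1}\phi_\kappa(p;\xi_1,\dots,\xi_n) = (p;\eta_1,\dots,\eta_n)$$

with $\eta_j = \sum_{i=1}^n \xi_i\,\frac{\partial x_i}{\partial y_j}(p)$. This says that the row vector $(\eta_1\ \cdots\ \eta_n)$ is the
product of the row vector $(\xi_1\ \cdots\ \xi_n)$ by the matrix $\big[\frac{\partial x_i}{\partial y_j}(p)\big]$. Taking the
transpose of this matrix equation, we see that the transition functions for the
cotangent bundle are to be

$$g_{\kappa'\kappa}(p) = \Big[\frac{\partial x_i}{\partial y_j}(p)\Big]^{\mathrm{tr}},$$

i.e., the transpose inverses of the transition functions for the tangent bundle.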
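In view of the boxed formula earlier in this section, a system of functions
$f_\kappa : M_\kappa\times\mathbb{F}^n \to S$ arises from a globally defined function on the cotangent
bundle if and only if

$$f_\kappa(x,\xi) = f_{\kappa'}\Big(y(x),\ \Big[\frac{\partial x_i}{\partial y_j}\Big]^{\mathrm{tr}}(\xi)\Big),$$

i.e., if and only if

$$f_\kappa\Big(x,\ \Big[\frac{\partial y_i}{\partial x_j}\Big]^{\mathrm{tr}}(\xi)\Big) = f_{\kappa'}(y,\xi).$$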
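
As a concrete check of the rule $\eta_j = \sum_i \xi_i\,\partial x_i/\partial y_j$, the following short computation may help. The manifold (an open half-plane) and the two charts (Cartesian and polar) are illustrative assumptions of ours and are not taken from the text.

```latex
% Assumed setting: M = \{(x_1,x_2) : x_1 > 0\}, \kappa = (x_1,x_2),
% \kappa' = (y_1,y_2) = (r,\theta) with x_1 = r\cos\theta, x_2 = r\sin\theta,
% r > 0, -\pi/2 < \theta < \pi/2.
\[
\Big[\frac{\partial x_i}{\partial y_j}\Big]
  = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix},
\]
% Row vector (\xi_1\ \xi_2) times this matrix gives (\eta_1\ \eta_2):
\[
\eta_1 = \xi_1\cos\theta + \xi_2\sin\theta,
\qquad
\eta_2 = -\xi_1\, r\sin\theta + \xi_2\, r\cos\theta,
\]
% which is exactly the statement that the same cotangent vector is written
\[
\xi_1\,dx_1 + \xi_2\,dx_2 = \eta_1\,dr + \eta_2\,d\theta .
\]
```

The last line can be confirmed directly from $dx_1 = \cos\theta\,dr - r\sin\theta\,d\theta$ and $dx_2 = \sin\theta\,dr + r\cos\theta\,d\theta$.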


If $\pi : B \to M$ is a smooth vector bundle, a section of $B$ is a function
$s : M \to B$ such that $\pi(s(p)) = p$ for all $p \in M$, and the section is a smooth
section if $s$ is smooth as a function between smooth manifolds.


Proposition 8.16. Let $\pi : B \to M$ be a smooth vector bundle of rank $n$,
let $s : M \to B$ be a section, and let $\kappa$ be a compatible chart for $M$. Then the
coordinate function $\phi_\kappa$ has the property that $\phi_\kappa^{-1}\circ s(p) = (p, v_\kappa(p))$ for $p$ in $M_\kappa$
and for a function $v_\kappa(\cdot) : M_\kappa \to \mathbb{F}^n$. Moreover, the section $s$ is smooth if and
only if the function $p \mapsto v_\kappa(p)$ is smooth for every chart $\kappa$ in an atlas.


PROOF. Let $P_\kappa : M_\kappa\times\mathbb{F}^n \to M_\kappa$ be projection to the first coordinate. Let us
check that $P_\kappa\circ\phi_\kappa^{-1} = \pi$ on $\pi^{-1}(M_\kappa)$. Suppose that $p$ is in $M_\kappa$ and $\phi_\kappa(p,v) = b$.
Applying $\pi$ gives $\pi(b) = \pi\phi_\kappa(p,v) = p$ by the defining property (i) of $\phi_\kappa$.
Therefore $\phi_\kappa^{-1}(b) = (p,v)$ and $P_\kappa\phi_\kappa^{-1}(b) = p = \pi(b)$. Since $b$ is arbitrary in
$\pi^{-1}(M_\kappa)$, $P_\kappa\circ\phi_\kappa^{-1} = \pi$.

For a section $s$, the condition $\pi\circ s = 1$ on $M$ therefore implies that $P_\kappa\circ\phi_\kappa^{-1}\circ s = 1$
on $M_\kappa$. Hence $\phi_\kappa^{-1}\circ s(p) = (p, v_\kappa(p))$ for $p$ in $M_\kappa$ and for some function
$v_\kappa : M_\kappa \to \mathbb{F}^n$. Since each $\phi_\kappa : M_\kappa\times\mathbb{F}^n \to \pi^{-1}(M_\kappa)$ is a diffeomorphism, $s$ is
smooth if and only if each function $\phi_\kappa^{-1}\circ s$ is smooth for $\kappa$ in an atlas, and this
condition holds if and only if each $v_\kappa$ is smooth.

EXAMPLES.
(1) Vector fields. A vector field on $M$ is a section of the tangent bundle. In
the first example in this section, we obtained the formula $\phi_\kappa(p,v) = \sum_{i=1}^n v_i \big[\frac{\partial}{\partial x_i}\big]_p$
if $p$ is in $M_\kappa$ and $v = (v_1,\dots,v_n)$. Applying $\phi_\kappa$ to the formula of Proposition
8.16, we see that $s(p) = \phi_\kappa(p, v(p)) = \sum_{i=1}^n v_i(p) \big[\frac{\partial}{\partial x_i}\big]_p$ if the function $v(p)$ is
$(v_1(p),\dots,v_n(p))$. On the other hand, Proposition 8.8 shows that any vector
field $X$ acts by $Xf(p) = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(p)\,(Xx_i)(p)$. If we regard $X$ as our section $s$,
we see therefore that $v_i(p) = (Xx_i)(p)$. Since $s$ is smooth if and only if all
$v_i(p)$ are smooth, $s$ is smooth if and only if all $(Xx_i)(p)$ are smooth. In view of
Proposition 8.8, we conclude that a vector field is smooth as a section if and only
if it is smooth in the sense of Section 2.


(2) Differential 1-forms. A differential 1-form on $M$ is a section of the cotangent
bundle. Just before Proposition 8.16 we obtained the formula $\phi_\kappa(p,\xi) =
\sum_{i=1}^n \xi_i\,dx_i(p)$ if $p$ is in $M_\kappa$ and $\xi = (\xi_1,\dots,\xi_n)$. Applying $\phi_\kappa$ to the formula
of Proposition 8.16, we see that $s(p) = \phi_\kappa(p,\xi(p)) = \sum_{i=1}^n \xi_i(p)\,dx_i(p)$ if the
function $\xi(p)$ is $(\xi_1(p),\dots,\xi_n(p))$. Proposition 8.16 shows that $s$ is smooth if
and only if all the $\xi_i(p)$ are smooth, and thus a differential 1-form is smooth if
and only if in each of its local expressions $\sum_{i=1}^n \xi_i(p)\,dx_i(p)$, all the coefficient
functions $\xi_i(p)$ are smooth. In particular Proposition 8.15 gives the formula
$df(p) = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(p)\,dx_i(p)$ whenever $f$ is a smooth function on $M_\kappa$, and therefore
$df$ is a smooth differential 1-form on $M$ whenever $f$ is in $C^\infty(M)$.


5. Distributions and Differential Operators on Manifolds 


The goal of Sections 5—7 is to describe the framework for extending the method 
of pseudodifferential operators, as introduced in Section VII.6, from open subsets 
of Euclidean space to smooth manifolds. Just as in Section VII.6 a number of 
lengthy verifications are involved, and we omit them. 

Several sources of examples with F = R are worth mentioning. All of 
them come about in the context of some smooth manifold with some additional 
structure. All of them involve differential operators, as opposed to general
pseudodifferential operators, at least initially. From this point of view, the reason
for introducing pseudodifferential operators is to have tools for working with
differential operators.

The first source is the subject of “Lie groups.” A Lie group G is a smooth 
manifold that is a group in such a way that multiplication and inversion are smooth 
functions. Closed subgroups of GL(n, F) furnish examples, but not in an obvious 
way. In any event, if a tangent vector at the identity is moved to arbitrary points
of G by the differentials of the left translations of G, the result is a vector
field that can be shown to be smooth and to have an invariance property relative 
to left translation. We can regard this left-invariant vector field as a first-order 
differential operator on G. Out of such operators we can form further differential 
operators by forming compositions, sums, and so on. 

A related and larger source is quotient spaces of Lie groups. Any Lie group G 
is a locally compact group in the sense of Chapter VI. If H is a closed subgroup, 
then the quotient G/H turns out to have a smooth structure such that the group 
action G x G/H — G/H is smooth. The quotient G/H may admit differential 
operators that are invariant under the action of G. For example the Laplacian 
makes sense on the unit sphere $S^{n-1}$ and is invariant under rotations. The sphere
$S^{n-1}$ is the quotient of rotation groups $SO(n)/SO(n-1)$, and thus the Laplacian
on the sphere falls into the category of an invariant differential operator on a 
quotient space of a Lie group. 

A third source, overlapping some with the previous two, is Riemannian ge- 
ometry. A Riemannian manifold M is a smooth manifold with an inner product 
specified on each tangent space T,(M) so as to vary smoothly with p. The 
additional structure on M is called a Riemannian metric and can be formalized, 
by the same process as for the tangent bundle itself, as a smooth section of a 
suitable vector bundle over M. A Riemannian manifold carries a natural Laplacian 
operator and other differential operators of interest that capture aspects of the 
geometry. One way of creating Riemannian manifolds is by embedding a smooth 
manifold of interest in a Riemannian manifold. For example one can embed 
any compact orientable 2-dimensional smooth manifold in $\mathbb{R}^3$, and $\mathbb{R}^3$ carries a
natural Riemannian metric. The inclusion of the manifold into $\mathbb{R}^3$ induces an
inclusion of tangent spaces, and the Riemannian metric of $\mathbb{R}^3$ can be restricted to
the manifold. 

A fourth source is the field of several complex variables. The Cauchy-Riemann
operator, consisting of $\partial/\partial\bar z_j$ in each complex variable $z_j$, makes sense on any
open set, and the functions annihilated by it are the holomorphic functions. If a
bounded open subset of $\mathbb{C}^n$ has a smooth boundary, then the tangential component
of the Cauchy-Riemann operator makes sense on smooth functions defined on
the boundary. The significance of the tangential Cauchy-Riemann operator is
that the functions annihilated by it are the ones that locally have extensions to
holomorphic functions in a neighborhood of the boundary. The Lewy example,
mentioned in Section VII.2, ultimately comes from such a construction using the
unit ball in $\mathbb{C}^2$.


The subject being sufficiently rich with examples, let us establish the frame- 
work. Let M be an n-dimensional smooth manifold. It is customary to assume 
that M is separable. This condition is satisfied in all examples of interest, and in 
particular every compact manifold is separable. With the assumption of separability,
we automatically obtain an exhausting sequence $\{K_j\}_{j=1}^\infty$ of compact sets
such that $M = \bigcup_{j=1}^\infty K_j$ and $K_j \subseteq K_{j+1}^o$ for all $j$.

We have already introduced the associative algebras $C^\infty(M)$ and $C^\infty_{\mathrm{com}}(M)$,
and these spaces of functions need to be topologized. For $C^\infty(M)$, the topology
is to be given by a countable separating family of seminorms, and convergence is
to be uniform convergence of the function and all its derivatives on each compact
subset of $M$. The exact family of seminorms will not matter, but we need to see
that it is possible to specify one. Fix $K_j$. To each point $p$ of $K_j$, associate a chart
$\kappa_p$ about $p$ and associate also a compact neighborhood $N_p$ of $p$ lying within $M_{\kappa_p}$.
For $p$ in $K_j$, the interiors $N_p^o$ of the $N_p$'s cover $K_j$, and we select a finite subcover
$N_{p_1}^o,\dots,N_{p_r}^o$. Let $\kappa_{p_1},\dots,\kappa_{p_r}$ be the corresponding charts. If $\varphi$ is in $C^\infty(M)$,
the seminorms of $\varphi$ relating to $K_j$ will be indexed by a multi-index $\alpha$ and an
integer $i$ with $1 \le i \le r$, the associated seminorm being
$\sup_{x\in\kappa_{p_i}(N_{p_i})} |D^\alpha(\varphi\circ\kappa_{p_i}^{-1})(x)|$.
When $j$ is allowed to vary, the result is that $C^\infty(M)$ is a complete metric space
with a metric given by countably many seminorms. If we construct seminorms by
starting from a different exhausting sequence, then there is no difficulty in seeing
that any seminorm in either construction is $\le$ a positive linear combination of
seminorms from the other construction. Thus the identity mapping of $C^\infty(M)$
with the one metric to $C^\infty(M)$ with the other metric is uniformly continuous.

For $C^\infty_{\mathrm{com}}(M)$, we use the inductive limit construction of Section IV.7 relative to
the sequence of compact subsets $K_j$. That is, we let $C^\infty_{K_j}$ be the vector subspace of
functions in $C^\infty_{\mathrm{com}}(M)$ with support in $K_j$, we give $C^\infty_{K_j}$ the relative topology from
$C^\infty(M)$, and then we form the inductive limit. Again the topology is independent
of the exhausting sequence, and $C^\infty_{\mathrm{com}}(M)$ is an LF space in the sense of Section
IV.7.


The next step is to introduce distributions on manifolds, and there we encounter
an unpleasant surprise. In Euclidean space the effect $\langle T,\varphi\rangle$ of a distribution on
a function was supposed to generalize the effect $\langle f,\varphi\rangle = \int f\varphi\,dx$ of integration
with a function $f$. The $dx$ in the Euclidean case refers to Lebesgue measure. To
get such an interpretation in the case of a manifold $M$, we have to use a measure on
$M$, and there may be no canonical one. If we drop any insistence that distributions
generalize integration with a function, then we encounter a different problem. The
problem is that the three global notions (smooth function, distribution, and linear
functional on smooth functions) each have to satisfy certain transformation rules
as we move from chart to chart, and these transformation rules are not compatible
with having the space of distributions coincide with the space of linear functionals
on smooth functions.

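There are several ways of handling this problem, and we use one of them. What
we shall do is fix a global but noncanonical notion of integration on $M$ satisfying
some smoothness properties. Thus we are constructing a positive linear functional
$\lambda$ on $C_{\mathrm{com}}(M)$. We suppose given relative to each chart $\kappa = (x_1,\dots,x_n)$ a
positive smooth function $g_\kappa(x)$ on $\kappa(M_\kappa)$ such that
$\lambda(\varphi) = \int_{\kappa(M_\kappa)} \varphi(\kappa^{-1}(x))\,g_\kappa(x)\,dx$
whenever $\varphi$ is in $C_{\mathrm{com}}(M_\kappa)$. Let $\kappa' = (y_1,\dots,y_n)$ be a second chart, and put
$M_{\kappa\kappa'} = M_\kappa \cap M_{\kappa'}$. If $\varphi$ is in $C_{\mathrm{com}}(M_{\kappa\kappa'})$, then we require that

$$\int_{\kappa(M_{\kappa\kappa'})} \varphi(\kappa^{-1}(x))\,g_\kappa(x)\,dx = \int_{\kappa'(M_{\kappa\kappa'})} \varphi(\kappa'^{-1}(y))\,g_{\kappa'}(y)\,dy.$$

Substituting $y = \kappa'(\kappa^{-1}(x))$ on the right side, we can transform the right side into
$\int_{\kappa(M_{\kappa\kappa'})} \varphi(\kappa^{-1}(x))\,g_{\kappa'}(\kappa'(\kappa^{-1}(x)))\,\big|\det\big[\frac{\partial y_i}{\partial x_j}\big]\big|\,dx$ by the change-of-variables
formula for multiple integrals. Thus the compatibility condition for the functions
$g_\kappa$ is that

$$g_\kappa(x) = g_{\kappa'}(y(x))\,\Big|\det\Big[\frac{\partial y_i}{\partial x_j}(x)\Big]\Big| \qquad\text{for } x\in\kappa(M_{\kappa\kappa'}),\ y(x) = \kappa'(\kappa^{-1}(x)).$$

Conversely if this compatibility condition on the system of $g_\kappa$'s is satisfied, we
can use a smooth partition of unity⁸ to define $\lambda$ consistently and obtain a measure
on $M$. This measure is a positive smooth function times Lebesgue measure in the
image of any chart, and we refer to it as a smooth measure on $M$. We denote it
by $\mu_g$. The key formula for computing with it is

$$\int_M \varphi\,d\mu_g = \int_{\kappa(M_\kappa)} \varphi(\kappa^{-1}(x))\,g_\kappa(x)\,dx$$

for all Borel functions $\varphi \ge 0$ on $M$ that equal 0 outside $M_\kappa$.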
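
The compatibility condition can be tested on a familiar case. In the sketch below, the manifold, the two charts, and the choice of Lebesgue measure are assumptions made only for illustration.

```latex
% Assumed setting: M = \{(x_1,x_2) : x_1 > 0\} with Lebesgue measure,
% \kappa = (x_1,x_2) and \kappa' = (r,\theta), x_1 = r\cos\theta, x_2 = r\sin\theta.
% Densities of Lebesgue measure in the two charts:
\[
g_\kappa(x_1,x_2) = 1, \qquad g_{\kappa'}(r,\theta) = r .
\]
% Jacobian of y = \kappa'\circ\kappa^{-1}, i.e. of (r,\theta) as a function of (x_1,x_2):
\[
\Big|\det\Big[\frac{\partial y_i}{\partial x_j}\Big]\Big|
  = \Big|\det\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}\Big|^{-1}
  = \frac{1}{r} .
\]
% Compatibility condition g_\kappa(x) = g_{\kappa'}(y(x))\,|\det[\partial y_i/\partial x_j]|:
\[
1 = r\cdot\frac{1}{r} .
\]
```

So the pair $\{1,\ r\}$ is a compatible system in the sense just described, and the resulting smooth measure is the usual $dx_1\,dx_2 = r\,dr\,d\theta$.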

One can prove that a smooth measure always exists,⁹ and there are important
cases in which a distinguished smooth measure exists. With Lie groups, for
example, a left Haar measure is distinguished. With the quotient of a Lie group
by a closed subgroup, Theorem 6.18 gives a necessary and sufficient condition for
the existence of a nonzero left-invariant Borel measure, and that is distinguished.
With a Riemannian manifold, there always exists a distinguished smooth measure
that is definable directly in terms of the Riemannian metric.


⁸Smooth partitions of unity are discussed in Problem 5 at the end of the chapter.

⁹If every connected component of M is orientable, there is a positive smooth differential n-form,
and it gives such a measure. All components are open; any nonorientable component has an orientable
double cover with such a measure, and this can be pushed down to the given manifold.

The smooth measure is not unique, but any two smooth measures $\mu_g$ and $\mu_h$
are absolutely continuous with respect to each other. By the Radon-Nikodym
Theorem we can therefore write $d\mu_h = F\,d\mu_g$ for a positive Borel function $F$;
the function $F$ may be redefined on a set of measure 0 so as to be in $C^\infty(M)$,
as we see by examining matters in local coordinates. Conversely if $F$ is any
everywhere-positive member of $C^\infty(M)$, then $F\,d\mu_g$ is another smooth measure.

If we fix a smooth measure $\mu_g$, we can define spaces $L^1_{\mathrm{com}}(M,\mu_g)$ and
$L^1_{\mathrm{loc}}(M,\mu_g)$ as follows: the first is the vector subspace of all members of
$L^1(M,\mu_g)$ with compact support, and the second is the vector space of all
functions, modulo null sets, whose restriction to each compact subset of $M$
is in $L^1_{\mathrm{com}}(M,\mu_g)$. It will not be necessary for us to introduce a topology on
$L^1_{\mathrm{com}}(M,\mu_g)$ or on $L^1_{\mathrm{loc}}(M,\mu_g)$. If we replace $\mu_g$ by another smooth measure
$d\mu_h = F\,d\mu_g$, then it is evident that $L^1_{\mathrm{com}}(M,\mu_h) = L^1_{\mathrm{com}}(M,\mu_g)$ and
$L^1_{\mathrm{loc}}(M,\mu_h) = L^1_{\mathrm{loc}}(M,\mu_g)$.

We define $\mathcal{D}'(M)$ and $\mathcal{E}'(M)$ in the expected way: $\mathcal{D}'(M)$, which is the space
of all distributions on $M$, is the vector space of all continuous linear functionals
on $C^\infty_{\mathrm{com}}(M)$, and $\mathcal{E}'(M)$ is the vector space of all continuous linear functionals
on $C^\infty(M)$. The effect of a distribution $T$ on a function $\varphi$ continues to be denoted
by $\langle T,\varphi\rangle$. The support of a distribution is the complement of the union of all
open subsets $U$ of $M$ such that the distribution vanishes on $C^\infty_{\mathrm{com}}(U)$. We omit the
verification that $\mathcal{E}'(M)$ is exactly the subspace of members of $\mathcal{D}'(M)$ of compact
support. It will not be necessary for us to introduce a topology on $\mathcal{D}'(M)$ or
$\mathcal{E}'(M)$.

With the smooth measure $\mu_g$ fixed, we can introduce distributions $T_f$ corresponding
to certain functions $f$. If $f$ is in $L^1_{\mathrm{loc}}(M,\mu_g)$, we define $T_f$ by

$$\langle T_f,\varphi\rangle = \int_M f\varphi\,d\mu_g \qquad\text{for } \varphi\in C^\infty_{\mathrm{com}}(M).$$

This is a member of $\mathcal{D}'(M)$. If $f$ is in $L^1_{\mathrm{com}}(M,\mu_g)$, we define $T_f$ by

$$\langle T_f,\varphi\rangle = \int_M f\varphi\,d\mu_g \qquad\text{for } \varphi\in C^\infty(M).$$

This is a member of $\mathcal{E}'(M)$.


As we did in the Euclidean case in Section V.2, we want to be able to pass from
certain continuous linear operators $L$ on smooth functions to linear operators on
distributions. With $\mu_g$ replacing Lebesgue measure, the procedure is unchanged.
We have a definition of $L$ on functions, and we identify a continuous transpose
operator $L^{\mathrm{tr}}$ on smooth functions satisfying the defining condition

$$\int_M L(f)\,\varphi\,d\mu_g = \int_M f\,L^{\mathrm{tr}}(\varphi)\,d\mu_g.$$

Then we let

$$\langle L(T),\varphi\rangle = \langle T, L^{\mathrm{tr}}(\varphi)\rangle.$$


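For example, if $L$ is the operator given as multiplication by the smooth function
$\psi$, then $L^{\mathrm{tr}} = L$ on smooth functions because we have
$\int_M L(f)\varphi\,d\mu_g = \int_M (\psi f)\varphi\,d\mu_g = \int_M f(\psi\varphi)\,d\mu_g = \int_M f\,L(\varphi)\,d\mu_g$.
Thus the definition is $\langle \psi T,\varphi\rangle = \langle T,\psi\varphi\rangle$.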
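
A one-line illustration of multiplication by a smooth function may be useful. The point mass $\delta_p$ used below is our own choice of example and does not appear in the text at this point.

```latex
% Assumed example: the point mass at p \in M, a member of \mathcal{E}'(M),
\[
\langle \delta_p, \varphi\rangle = \varphi(p) \qquad\text{for } \varphi \in C^\infty(M).
\]
% Multiplication by a smooth function \psi, using \langle \psi T,\varphi\rangle = \langle T,\psi\varphi\rangle:
\[
\langle \psi\,\delta_p, \varphi\rangle
  = \langle \delta_p, \psi\varphi\rangle
  = \psi(p)\,\varphi(p)
  = \langle \psi(p)\,\delta_p, \varphi\rangle ,
\]
% so that \psi\,\delta_p = \psi(p)\,\delta_p.
```

Note that this particular computation does not involve the smooth measure $\mu_g$ at all, since $\delta_p$ is defined directly as a linear functional.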


A linear differential operator $L$ of order $\le m$ on a manifold $M$ is a continuous
linear operator from $C^\infty(M)$ into itself with the property that for each point $p$
in $M$, there is some compatible chart $\kappa$ about $p$ and there are functions $a_\alpha$ in
$C^\infty(M_\kappa)$ such that the operator takes the form $Lf(q) = \sum_{|\alpha|\le m} a_\alpha(q)\,D^\alpha f(q)$
for all $f$ in $C^\infty(M_\kappa)$. Here if $\kappa = (x_1,\dots,x_n)$, then $D^\alpha f(q)$ is by definition the
Euclidean expression $D^\alpha(f\circ\kappa^{-1})(x_1,\dots,x_n)$ evaluated at $\kappa(q)$.

If we have an expansion $Lf(q) = \sum_{|\alpha|\le m} a_\alpha(q)\,D^\alpha f(q)$ in the chart $\kappa$ about $p$
and if $\kappa'$ is another compatible chart about $p$, then a Euclidean change of variables
shows that $Lf(q)$ is of the form $\sum_{|\beta|\le m} b_\beta(q)\,D^\beta f(q)$ in the chart $\kappa'$ for suitable
smooth coefficient functions $b_\beta$.

The operator $L$ carries the vector subspace $C^\infty_{\mathrm{com}}(M)$ of $C^\infty(M)$ into itself and
is continuous as a mapping of $C^\infty_{\mathrm{com}}(M)$ into itself. One says that $L$ has order $m$
if in some compatible chart, some coefficient function $a_\alpha$ with $|\alpha| = m$ is not
identically 0.

Let us compute how the transpose of a linear differential operator of order 
m acts on smooth functions. The claim is that this transpose is again a linear 
differential operator of order m. Since linear differential operators on open subsets 
of Euclidean space are mapped to other such operators by diffeomorphisms, it is 
enough to make a computation in a neighborhood of a point p within a compatible 
chart « about p. Evidently the operation of taking the transpose is linear and 
reverses the order of operators, and we saw that multiplication by a smooth 
function is its own transpose. Thus it is enough to verify that the transpose of
$\frac{\partial}{\partial x_j}$ is a linear differential operator.

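To simplify the notation in the verification, let us abbreviate $\langle T_f,\varphi\rangle$ as $\langle f,\varphi\rangle$
when $f$ and $\varphi$ are smooth functions on $M$ and at least one of them has compact
support. That is, we set $\langle f,\varphi\rangle = \int_M f\varphi\,d\mu_g$. Let $\varphi$ and $\psi$ be in $C^\infty(M_\kappa)$, and
assume that one of $\varphi$ and $\psi$ has compact support. With $\{g_\kappa\}$ as the system of
functions defining the smooth measure $\mu_g$, we have

$$\int_{\kappa(M_\kappa)} \frac{\partial}{\partial x_j}\big((\varphi\circ\kappa^{-1})(\psi\circ\kappa^{-1})\,g_\kappa\big)(x)\,dx = 0.$$

Expanding the derivative and setting $h_\kappa = g_\kappa\circ\kappa$ gives

$$\Big\langle \Big(\frac{\partial}{\partial x_j}\Big)^{\mathrm{tr}}\varphi,\ \psi\Big\rangle = \Big\langle \varphi,\ \frac{\partial\psi}{\partial x_j}\Big\rangle$$
$$= \int_{\kappa(M_\kappa)} \varphi(\kappa^{-1}(x))\,\frac{\partial}{\partial x_j}(\psi\circ\kappa^{-1})(x)\,g_\kappa(x)\,dx$$
$$= -\int_{\kappa(M_\kappa)} \psi(\kappa^{-1}(x))\,\frac{\partial}{\partial x_j}\big((\varphi\circ\kappa^{-1})\,g_\kappa\big)(x)\,dx$$
$$= -\int_{\kappa(M_\kappa)} g_\kappa(x)^{-1}\,\psi(\kappa^{-1}(x))\,\frac{\partial}{\partial x_j}\big((\varphi\circ\kappa^{-1})\,g_\kappa\big)(x)\,g_\kappa(x)\,dx$$
$$= -\int_{\kappa(M_\kappa)} (h_\kappa\circ\kappa^{-1})(x)^{-1}\,(\psi\circ\kappa^{-1})(x)\,\frac{\partial}{\partial x_j}\big((\varphi\circ\kappa^{-1})(h_\kappa\circ\kappa^{-1})\big)(x)\,g_\kappa(x)\,dx.$$

Therefore $\big(\frac{\partial}{\partial x_j}\big)^{\mathrm{tr}}\varphi = -(h_\kappa)^{-1}\frac{\partial}{\partial x_j}(h_\kappa\varphi)$, and $\big(\frac{\partial}{\partial x_j}\big)^{\mathrm{tr}}$ is exhibited as a linear
differential operator in local coordinates.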
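
To see the formula for the transpose of a coordinate derivative in action, here is a small worked case. The choice of manifold, chart, and smooth measure is an assumption made for illustration only.

```latex
% Assumed setting: M = \{(x_1,x_2) : x_1 > 0\} with Lebesgue measure,
% viewed in the polar chart \kappa = (r,\theta), so that h_\kappa = r.
% The general formula (\partial/\partial x_j)^{tr}\varphi = -h_\kappa^{-1}\,\partial(h_\kappa\varphi)/\partial x_j gives
\[
\Big(\frac{\partial}{\partial r}\Big)^{\mathrm{tr}}\varphi
  = -\frac{1}{r}\,\frac{\partial}{\partial r}(r\varphi)
  = -\frac{\partial\varphi}{\partial r} - \frac{1}{r}\,\varphi .
\]
% Direct check by integration by parts in r, for \varphi or \psi of compact support:
\[
\int\!\!\int \varphi\,\frac{\partial\psi}{\partial r}\; r\,dr\,d\theta
  = -\int\!\!\int \psi\,\frac{\partial}{\partial r}(r\varphi)\,dr\,d\theta
  = \int\!\!\int \psi\,\Big(-\frac{1}{r}\frac{\partial}{\partial r}(r\varphi)\Big)\, r\,dr\,d\theta .
\]
```

The extra zeroth-order term $-\varphi/r$ reflects the fact that the density $r$ of the measure is not constant in this chart; with a constant density the transpose would simply be $-\partial/\partial r$.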

Certainly transpose does not increase the order of a linear differential operator. 
Applying transpose twice reproduces the original operator, and it follows that the 
transpose differential operator has the same order as the original. 

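If $L$ is a linear differential operator acting on $C^\infty_{\mathrm{com}}(M)$ or $C^\infty(M)$, we are now
in a position to extend the definition of $L$ to distributions. To do so, we form the
linear differential operator $L^{\mathrm{tr}}$ such that $\langle L\varphi,\psi\rangle = \langle\varphi, L^{\mathrm{tr}}\psi\rangle$ whenever $\varphi$ and $\psi$
are smooth on $M$ and at least one of them has compact support. If $T$ is in $\mathcal{D}'(M)$,
we define $L(T)$ in $\mathcal{D}'(M)$ by $\langle L(T),\varphi\rangle = \langle T, L^{\mathrm{tr}}\varphi\rangle$ for $\varphi$ in $C^\infty_{\mathrm{com}}(M)$. If $T$ is in
$\mathcal{E}'(M)$, then we can allow $\varphi$ to be in $C^\infty(M)$, and the consequence is that $L(T)$ is
in $\mathcal{E}'(M)$. Thus $L$ carries $\mathcal{D}'(M)$ to itself and $\mathcal{E}'(M)$ to itself.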


Recall from Section VII.6 that a linear differential operator $\sum_{|\alpha|\le m} a_\alpha(x)D^\alpha$
of order $m$ has, by definition, full symbol $\sum_{|\alpha|\le m} a_\alpha(x)(2\pi i)^{|\alpha|}\xi^\alpha$ and principal
symbol $\sum_{|\alpha|=m} a_\alpha(x)(2\pi i)^{|\alpha|}\xi^\alpha$, with the factors of $2\pi i$ reflecting the way that
the Euclidean Fourier transform is defined in this book. When we try to extend 
this definition in a coordinate-free way to smooth manifolds M, we find no ready 
generalization of the full symbol, but we shall see that the principal symbol 
extends to be a certain kind of function on the cotangent bundle of M. 

Let $L$ be a linear differential operator on $M$ of order $m$. Fix a point $p$ in $M$,
let $\kappa = (x_1,\dots,x_n)$ be a compatible chart about $p$, and let $\varphi$ be in $C^\infty(M_\kappa)$.
Suppose that $D^\alpha$ makes a contribution to $L$ in this chart. For $t > 0$ and $f$ in
$C^\infty(M_\kappa)$, consider the expression

$$t^{-m}\,e^{-2\pi i t\varphi}\,D^\alpha\big(e^{2\pi i t\varphi} f\big) \qquad\text{evaluated at } p.$$

We are interested in this expression in the limit $t \to \infty$. When $D^\alpha(e^{2\pi i t\varphi} f)$ is
expanded by the Leibniz rule, each derivative that is applied to $e^{2\pi i t\varphi}$ yields a
factor of $t$, and each derivative that is applied to $f$ yields no such factor. Moreover,
the exponentials cancel after the differentiations. The surviving dependence on
$t$ in each term is of the form $t^{-r}$, where $r \ge m - |\alpha|$. Thus our expression
has limit 0 if $|\alpha| < m$. If $|\alpha| = m$, we get a nonzero contribution only when
all the derivatives from the Leibniz rule are applied to the exponential. Thus the
limit of our expression with $|\alpha| = m$ is of the form $c\,f(p)$, where $c$ is a constant
depending on $\alpha$ and the germ of $\varphi$ at $p$.

Meanwhile, our expression is unaffected by replacing $\varphi$ by $\varphi - \varphi(p)$, and its
dependence on $\varphi$ is therefore as a member of $C_p^0$. A little checking shows that
our expression is unchanged if a member of $C_p^1$ is added to $\varphi$. Consequently our
expression, for $\alpha$ fixed with $|\alpha| = m$, is a function on $C_p^0/C_p^1 = T_p^*(M)$.

Let us write a general member of $T^*(M)$ as $(p,\xi)$. We define the principal
symbol of the linear differential operator $L$ of order $m$ to be the scalar-valued
function $\sigma_L(p,\xi)$ on the real cotangent bundle $T^*(M,\mathbb{R})$ given by

$$\sigma_L(p,\xi)\,f(p) = \lim_{t\to\infty} t^{-m}\,e^{-2\pi i t\varphi}\,L\big(e^{2\pi i t\varphi} f\big)(p),$$

where $\varphi$ is chosen so that $d\varphi(p) = \xi$. Reviewing the construction above, we see
that this definition is independent of $f$ and of any choice of local coordinates.

We can compute the principal symbol explicitly if an expression for $L$ is
given in local coordinates. With our chart $\kappa = (x_1,\dots,x_n)$ as above, we know
from Proposition 8.15 that the differentials $dx_1(p),\dots,dx_n(p)$ form a basis
of $T_p^*(M)$. Let the expansion of the given cotangent vector $\xi$ in this basis be
$\xi = \sum_i \xi_i\,dx_i(p)$, and define $\varphi(x) = \sum_i \xi_i\,(x_i - x_i(p))$. This function has
$d\varphi(p) = \xi$ by Proposition 8.15, and direct computation gives

$$\sigma_L(p,\xi) = \sum_{|\alpha|=m} a_\alpha(x)\,(2\pi i)^{|\alpha|}\,\xi^\alpha \qquad\text{if } L = \sum_{|\alpha|\le m} a_\alpha(x)\,D^\alpha.$$

In particular, $\sigma_L(p,\xi)$ is homogeneous of degree $m$ in the $\xi$ variable.¹⁰
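
The limit defining the principal symbol can be followed through by hand in a simple case. The operator below (the Euclidean Laplacian on an open subset of $\mathbb{R}^n$, so that $m = 2$) is our own illustrative choice.

```latex
% Assumed example: L = \Delta = \sum_j \partial^2/\partial x_j^2, of order m = 2.
% Two applications of the product rule give
\[
\Delta\big(e^{2\pi i t\varphi} f\big)
  = e^{2\pi i t\varphi}\Big[(2\pi i t)^2\,|\nabla\varphi|^2 f
      + 2\pi i t\,\big((\Delta\varphi)\,f + 2\,\nabla\varphi\cdot\nabla f\big)
      + \Delta f\Big].
\]
% Multiply by t^{-2} e^{-2\pi i t\varphi} and let t \to \infty; only the t^2 term survives:
\[
\lim_{t\to\infty} t^{-2}\,e^{-2\pi i t\varphi}\,\Delta\big(e^{2\pi i t\varphi} f\big)(p)
  = (2\pi i)^2\,|\nabla\varphi(p)|^2\, f(p).
\]
% With d\varphi(p) = \xi, i.e. \partial\varphi/\partial x_j(p) = \xi_j:
\[
\sigma_\Delta(p,\xi) = -4\pi^2\,|\xi|^2 = \sum_{j=1}^n (2\pi i)^2\,\xi_j^2 .
\]
```

This agrees with the coordinate formula $\sum_{|\alpha|=m} a_\alpha(x)(2\pi i)^{|\alpha|}\xi^\alpha$, and it shows concretely that the surviving term is the one in which both derivatives fall on the exponential.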
Before introducing pseudodifferential operators on an n-dimensional separable
smooth manifold $M$, it is necessary to supplement the Euclidean theory as presented
in Section VII.6. We need to understand the effect of transpose on a
Euclidean pseudodifferential operator and also the effect of a diffeomorphism.

First let us consider transpose. If $G$ is a pseudodifferential operator on $U \subseteq \mathbb{R}^n$,
we know that

$$\langle G^{\mathrm{tr}}\psi,\varphi\rangle = \langle\psi, G\varphi\rangle = \int_{\mathbb{R}^n}\int_U\int_U e^{2\pi i(x-y)\cdot\xi}\,g(x,\xi)\,\psi(x)\,\varphi(y)\,dy\,dx\,d\xi$$

for $\varphi$ and $\psi$ in $C^\infty_{\mathrm{com}}(U)$. If we interchange $x$ and $y$ and replace $\xi$ by $-\xi$, we
obtain

$$\langle G^{\mathrm{tr}}\psi,\varphi\rangle = \int_{\mathbb{R}^n}\int_U\int_U e^{2\pi i(x-y)\cdot\xi}\,g(y,-\xi)\,\psi(y)\,\varphi(x)\,dy\,dx\,d\xi.$$


¹⁰A function $\sigma(p,\xi)$ is homogeneous of degree $m$ in the $\xi$ variable if $\sigma(p,r\xi) = r^m\sigma(p,\xi)$
for all $r > 0$ and all $\xi \ne 0$.

The function that ought to play the role of the symbol of $G^{\mathrm{tr}}$ is $g(y,-\xi)$. It has a
nontrivial $y$ dependence, unlike what happens with pseudodifferential operators
as defined in Section VII.6. Thus we cannot tell from this formula whether
$G^{\mathrm{tr}}$ coincides with a pseudodifferential operator. Although it is possible to
cope with this problem directly, a tidier approach is to enlarge the definition
of pseudodifferential operator to allow dependence on $y$, as well as on $x$ and $\xi$,
in the function playing the role of the symbol. Then the transpose of one of the
new operators will again be an operator of the same kind, and one can develop
a theory for the enlarged class of operators.¹¹ Remarkably, as we shall see, the
new class of operators turns out to be not so much larger than the original class.

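Accordingly, let $S^m_{1,0,0}(U\times U)$ be the set of all functions $g$ in $C^\infty(U\times U\times\mathbb{R}^n)$
such that for each compact set $K \subseteq U\times U$ and each triple of multi-indices
$(\alpha,\beta,\gamma)$, there exists a constant $C = C_{K,\alpha,\beta,\gamma}$ with

$$|D_\xi^\alpha D_x^\beta D_y^\gamma\,g(x,y,\xi)| \le C(1+|\xi|)^{m-|\alpha|} \qquad\text{for } (x,y)\in K \text{ and } \xi\in\mathbb{R}^n.$$

Then $D_\xi^\alpha D_x^\beta D_y^\gamma\,g$ will be a symbol in the class $S^{m-|\alpha|}_{1,0,0}(U\times U)$. Let $S^{-\infty}_{1,0,0}(U\times U)$
be the intersection of all $S^{-n}_{1,0,0}(U\times U)$ for $n \ge 0$. A function $g(x,y,\xi)$ in
$S^m_{1,0,0}(U\times U)$ is called an amplitude, and the generalized pseudodifferential
operator that is associated to it is given by¹²

$$G\varphi(x) = \int_{\mathbb{R}^n}\int_U e^{2\pi i(x-y)\cdot\xi}\,g(x,y,\xi)\,\varphi(y)\,dy\,d\xi$$

for $\varphi$ in $C^\infty_{\mathrm{com}}(U)$. Such an operator is continuous from $C^\infty_{\mathrm{com}}(U)$ into $C^\infty(U)$.
The transposed operator $G^{\mathrm{tr}}$ such that $\langle G\varphi,\psi\rangle = \langle\varphi, G^{\mathrm{tr}}\psi\rangle$ for $\varphi$ and $\psi$ in
$C^\infty_{\mathrm{com}}(U)$ is given by

$$G^{\mathrm{tr}}\varphi(x) = \int_{\mathbb{R}^n}\int_U e^{-2\pi i(x-y)\cdot\xi}\,g(y,x,\xi)\,\varphi(y)\,dy\,d\xi,$$

which becomes an operator of the same kind when we change $\xi$ into $-\xi$. Because
of the displayed formula for $G^{\mathrm{tr}}\varphi(x)$, we are led to define

$$\langle Gf,\varphi\rangle = \Big\langle f,\ \int_{\mathbb{R}^n}\int_U e^{-2\pi i((\cdot)-y)\cdot\xi}\,g(y,\,\cdot\,,\xi)\,\varphi(y)\,dy\,d\xi\Big\rangle$$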


¹¹The theory for the new operators is the "tidier and faster" approach to Euclidean pseudodifferential
operators that was mentioned just before the statement of Theorem 7.20.

¹²The use of the word "generalized" here is not standard terminology. It would be more standard
to use some distinctive notation for the class of operators of this kind, but we have introduced no
notation for it at all.

for $f \in \mathcal{E}'(U)$ and $\varphi \in C^\infty_{\mathrm{com}}(U)$. Then $Gf$ is in $\mathcal{D}'(U)$. In the special case that
$g$ is independent of its second variable, the above formula for $\langle Gf,\varphi\rangle$ reduces to
the formula for $\langle Gf,\varphi\rangle$ in Section VII.6 as a consequence of Theorem 5.20 and
an interchange of limits.¹³

If the amplitude of $G$ is in $S^{-\infty}_{1,0,0}(U\times U)$, then the generalized pseudodifferential
operator $G$ carries $\mathcal{E}'(U)$ into $C^\infty(U)$, and it is consequently said to be a
smoothing operator.

Following the pattern of the development in Section VII.6, we define a linear
functional $\mathcal{G}$ on $C^\infty_{\mathrm{com}}(U\times U)$ by the formula

$$\langle\mathcal{G}, w\rangle = \int_{\mathbb{R}^n}\Big[\int_{U\times U} e^{2\pi i(x-y)\cdot\xi}\,g(x,y,\xi)\,w(x,y)\,dx\,dy\Big]\,d\xi.$$

Then $\mathcal{G}$ is continuous and hence is a member of $\mathcal{D}'(U\times U)$. The formal expression

$$\mathcal{G}(x,y) = \int_{\mathbb{R}^n} e^{2\pi i(x-y)\cdot\xi}\,g(x,y,\xi)\,d\xi$$

is called the distribution kernel of $G$; again it is not to be regarded as a function
but as an expression that defines a distribution.

With the insertion of the word "generalized" in front of "pseudodifferential
operator," Theorem 7.19 remains true word for word; the distribution kernel is a
smooth function off the diagonal in $U\times U$, and the operator is pseudolocal.

We extend the definition of properly supported from pseudodifferential op- 
erators to the generalized operators. Examining the extended definition along 
with the formula for the distribution kernel, we see that $G$ is properly supported if
and only if $G^{\mathrm{tr}}$ is properly supported. The main theorem concerning generalized
pseudodifferential operators is as follows. 


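Theorem 8.17. For $U$ open in $\mathbb{R}^n$, let $G$ be the generalized pseudodifferential
operator corresponding to an amplitude $g(x,y,\xi)$ in $S^m_{1,0,0}(U\times U)$, and suppose
that $G$ is properly supported. Then

(a) $G$ is the pseudodifferential operator with symbol
$g(x,\xi) = e^{-2\pi i x\cdot\xi}\,G\big(e^{2\pi i(\cdot)\cdot\xi}\big)(x)$ in $S^m_{1,0}(U)$,

(b) the symbol $g(x,\xi)$ has asymptotic series

$$g(x,\xi) \sim \sum_\alpha \frac{(2\pi i)^{-|\alpha|}}{\alpha!}\,D_\xi^\alpha D_y^\alpha\,g(x,y,\xi)\Big|_{y=x}.$$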

¹³This discussion therefore completes the justification of the definition of $\langle Gf,\varphi\rangle$ in Section
VII.6.

In (a) of Theorem 8.17, the fact that $G$ is properly supported implies that $G$
extends to be defined on $C^\infty(U)$, and $e^{2\pi i(\cdot)\cdot\xi}$ is a member of this space. The
pseudodifferential operator with symbol $g(x,\xi)$ as in (a) carries $\varphi$ to the function

$$\int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,g(x,\xi)\,\widehat{\varphi}(\xi)\,d\xi = \int_{\mathbb{R}^n} G\big(e^{2\pi i(\cdot)\cdot\xi}\big)(x)\,\widehat{\varphi}(\xi)\,d\xi,$$

and the assertion in (a) is that this equals $G\varphi(x)$. Consequently the assertion
is that if $G$ is applied to the formula $\varphi(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,\widehat{\varphi}(\xi)\,d\xi$, then $G$ may
be moved under the integral sign. This interchange of limits is almost handled
pointwise for each $x$ by Problem 5 in Chapter V, but we cannot take the compact
metric space $K$ in that problem to be all of $\mathbb{R}^n$. Instead, we take $K$ to be a large
ball in $\mathbb{R}^n$, apply the result of Problem 5, and do a passage to the limit.
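
Both parts of Theorem 8.17 can be checked by hand on a small example. The operator below, in one variable, is an assumption of ours chosen because every step is explicit; $a$ denotes an arbitrary smooth function on $U \subseteq \mathbb{R}$.

```latex
% Assumed example (n = 1): the amplitude g(x,y,\xi) = 2\pi i\,\xi\,a(y) in S^1_{1,0,0}(U \times U).
% By Fourier inversion the associated operator is
\[
G\varphi(x) = \int_{\mathbb{R}}\int_U e^{2\pi i(x-y)\xi}\,2\pi i\,\xi\,a(y)\,\varphi(y)\,dy\,d\xi
            = \frac{d}{dx}\big(a\varphi\big)(x),
\]
% a differential operator, hence properly supported.
% Part (a): the symbol is
\[
g(x,\xi) = e^{-2\pi i x\xi}\,\frac{d}{dx}\big(a(x)\,e^{2\pi i x\xi}\big)
         = 2\pi i\,\xi\,a(x) + a'(x).
\]
% Part (b): only \alpha = 0 and \alpha = 1 contribute to the asymptotic series,
\[
\alpha = 0:\ \ 2\pi i\,\xi\,a(x),
\qquad
\alpha = 1:\ \ (2\pi i)^{-1}\,\frac{\partial}{\partial\xi}\frac{\partial}{\partial y}\big(2\pi i\,\xi\,a(y)\big)\Big|_{y=x} = a'(x),
\]
% and all terms with \alpha \ge 2 vanish because g is of first degree in \xi.
```

Here the asymptotic series terminates and reproduces the symbol from (a) exactly, which is what one expects for a differential operator.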

The proof of (b) is long but reuses some of the omitted proof of Theorem 
7.20. In the course of the argument, one obtains as a byproduct a conclusion that 
does not make use of the hypothesis “properly supported.” Theorem 8.18 may 
be regarded as an extension of Theorem 7.22a to the present setting. 


Theorem 8.18. For $U$ open in $\mathbb{R}^n$, let $G$ be the generalized pseudodifferential
operator corresponding to an amplitude in $S^m_{1,0,0}(U\times U)$. Then there exist a
pseudodifferential operator $G_1$ with symbol in $S^m_{1,0}(U)$ and a generalized pseudodifferential
operator $G_2$ corresponding to an amplitude in $S^{-\infty}_{1,0,0}(U\times U)$ such that
$G = G_1 + G_2$.


In any event, Theorem 8.17 is the heart of the theory of generalized pseudodifferential
operators in Euclidean space, and most other results are derived
from it. It is immediate from Theorem 8.17 that if $G$ is a properly supported
pseudodifferential operator as in Chapter VII with symbol $g(x,\xi)$ in $S^m_{1,0}(U)$,
then so is $G^{\mathrm{tr}}$, and furthermore the symbol $g^{\mathrm{tr}}(x,\xi)$ has asymptotic series

$$g^{\mathrm{tr}}(x,\xi) \sim \sum_\alpha \frac{(2\pi i)^{-|\alpha|}}{\alpha!}\,D_\xi^\alpha D_x^\alpha\,g(x,-\xi).$$


In the treatment of composition, the result is unchanged from Theorem 7.22b,
but the use of amplitudes greatly simplifies the proof. In fact, let $G$ and $H$ be
two properly supported pseudodifferential operators with respective symbols $g$
and $h$, and let $h^{\mathrm{tr}}$ be the symbol of $H^{\mathrm{tr}}$. Since $H = (H^{\mathrm{tr}})^{\mathrm{tr}}$, we have

$$H\varphi(x) = \int_{\mathbb{R}^n}\int_U e^{2\pi i(x-y)\cdot\xi}\,h^{\mathrm{tr}}(y,-\xi)\,\varphi(y)\,dy\,d\xi \qquad\text{for } \varphi\in C^\infty_{\mathrm{com}}(U).$$

Using Fourier inversion, we recognize this formula as saying that
$\widehat{H\varphi}(\xi) = \int_U e^{-2\pi i y\cdot\xi}\,h^{\mathrm{tr}}(y,-\xi)\,\varphi(y)\,dy$. Substituting $\psi = H\varphi$ in the formula
$G\psi(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\,g(x,\xi)\,\widehat{\psi}(\xi)\,d\xi$ therefore gives

$$GH\varphi(x) = \int_{\mathbb{R}^n}\int_U e^{2\pi i(x-y)\cdot\xi}\,g(x,\xi)\,h^{\mathrm{tr}}(y,-\xi)\,\varphi(y)\,dy\,d\xi.$$

We conclude that $GH$ is the generalized pseudodifferential operator with amplitude
$g(x,\xi)\,h^{\mathrm{tr}}(y,-\xi)$. Applying Theorem 8.17b and sorting out the asymptotic
series that the theorem gives, we obtain a quick proof of Theorem 7.22b.


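We turn to the effect of diffeomorphisms on Euclidean pseudodifferential operators.
Let $\Phi : U \to U^\#$ be a diffeomorphism between open subsets of $\mathbb{R}^n$, and
suppose that a generalized pseudodifferential operator $G : C^\infty_{\mathrm{com}}(U) \to C^\infty(U)$
is given by

$$G\varphi(x) = \int_{\mathbb{R}^n}\int_U e^{2\pi i(x-y)\cdot\xi}\,g(x,y,\xi)\,\varphi(y)\,dy\,d\xi$$

for $\varphi$ in $C^\infty_{\mathrm{com}}(U)$. We define $G^\#$ to be the operator carrying $C^\infty_{\mathrm{com}}(U^\#)$ to $C^\infty(U^\#)$
and given by

$$G^\#\psi = (G(\psi\circ\Phi))\circ\Phi^{-1} \qquad\text{for } \psi\in C^\infty_{\mathrm{com}}(U^\#).$$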


Our objectives are to see that G* is a generalized pseudodifferential operator, to 
obtain a formula for an amplitude of it, and to examine the effect on symbols. 

Let us put $x^\# = \Phi(x)$ and $y^\# = \Phi(y)$. Put $\Phi_1 = \Phi^{-1}$. Direct use of the
change-of-variables formula for multiple integrals gives

$$G^\#\psi(x^\#) = G(\psi\circ\Phi)(x) = \int_{\mathbb{R}^n}\int_U e^{2\pi i(x-y)\cdot\xi}\,g(x,y,\xi)\,\psi(\Phi(y))\,dy\,d\xi$$
$$= \int_{\mathbb{R}^n}\int_{U^\#} e^{2\pi i(\Phi_1(x^\#)-\Phi_1(y^\#))\cdot\xi}\,g(\Phi_1(x^\#),\Phi_1(y^\#),\xi)\,\psi(y^\#)\,|\det((\Phi_1)'(y^\#))|\,dy^\#\,d\xi.$$

The hard part in showing that the expression on the right side is a generalized
pseudodifferential operator is to handle the exponential factor. The starting point
is the formula

$$\Phi_1(x^\#) - \Phi_1(y^\#) = \int_0^1 (\Phi_1)'\big(t x^\# + (1-t) y^\#\big)(x^\# - y^\#)\,dt,$$

which is valid if the line segment from $x^\#$ to $y^\#$ lies in $U^\#$ and which follows from
the directional derivative formula and the Fundamental Theorem of Calculus.


From that, one derives the following lemma. 


Lemma 8.19. About each point $X = (p^\#, q^\#)$ of $U^\#\times U^\#$, there exist an open
neighborhood $N_X$ and a smooth function $J_X : N_X \to GL(n,\mathbb{F})$ such that

$$\Phi_1(x^\#) - \Phi_1(y^\#) = J_X(x^\#,y^\#)(x^\# - y^\#)$$

for every $(x^\#, y^\#)$ in $N_X$.

The lemma allows us to write $e^{2\pi i(\Phi_1(x^\#)-\Phi_1(y^\#))\cdot\xi} = e^{2\pi i(x^\#-y^\#)\cdot J_X(x^\#,y^\#)^{\mathrm{tr}}(\xi)}$ for
$(x^\#, y^\#)$ in $N_X$. Thus locally we can convert the integrand for $G^\#\psi(x^\#)$ into the
integrand of a generalized pseudodifferential operator. It is just a question of
fitting the pieces together. Using an exhausting sequence for $U^\#$ and a smooth
partition of unity,¹⁴ one can find a sequence of points $X_i$ and smooth functions
$h_i$ with values in $[0,1]$ such that $h_i$ has compact support in $N_{X_i}$, such that each
point of $U^\#\times U^\#$ has a neighborhood in which only finitely many $h_i$ are nonzero,
and such that $\sum_i h_i$ is identically 1. Let $J_i$ be the function $J_{X_i}$ of the lemma.
Sorting out the details leads to the following result.


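Theorem 8.20. If $\Phi : U \to U^\#$ is a diffeomorphism between open sets in
$\mathbb{R}^n$, if $G : C^\infty_{\mathrm{com}}(U) \to C^\infty(U)$ is the generalized pseudodifferential operator
with amplitude $g(x, y, \xi)$ in $S^m_{1,0,0}(U \times U)$, and if $G^\#$ is defined by $G^\#\psi =
(G(\psi \circ \Phi)) \circ \Phi^{-1}$, then $G^\#$ is the generalized pseudodifferential operator on $U^\#$
with amplitude

\[ g^\#(x^\#, y^\#, \eta) = |\det((\Phi^{-1})'(y^\#))| \sum_j h_j(x^\#, y^\#)\,|\det J_j(x^\#, y^\#)|^{-1}\, g\big(x, y, (J_j(x^\#, y^\#)^{\mathrm{tr}})^{-1}(\eta)\big) \]

in $S^m_{1,0,0}(U^\# \times U^\#)$, where $x = \Phi^{-1}(x^\#)$ and $y = \Phi^{-1}(y^\#)$. If $G$ is properly
supported, then so is $G^\#$.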
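
To see where the factor $|\det J_j|^{-1}$ and the argument $(J_j^{\mathrm{tr}})^{-1}(\eta)$ come from, here is a formal sketch of the substitution on a single neighborhood $N_X$ of Lemma 8.19. The abbreviation $J = J_X(x^\#, y^\#)$ is used only here, and the integral in $\xi$ is to be read in the same iterated sense as in the definition of the operator.

\[
\begin{aligned}
J(x^\# - y^\#)\cdot\xi &= (x^\# - y^\#)\cdot J^{\mathrm{tr}}\xi, \\
\eta = J^{\mathrm{tr}}\xi \quad &\Longrightarrow \quad \xi = (J^{\mathrm{tr}})^{-1}\eta, \qquad d\xi = |\det J|^{-1}\,d\eta, \\
\int_{\mathbb{R}^n} e^{2\pi i J(x^\# - y^\#)\cdot\xi}\, g(x, y, \xi)\,d\xi
&= \int_{\mathbb{R}^n} e^{2\pi i (x^\# - y^\#)\cdot\eta}\, |\det J|^{-1}\, g\big(x, y, (J^{\mathrm{tr}})^{-1}\eta\big)\,d\eta .
\end{aligned}
\]

Multiplying by the Jacobian factor $|\det((\Phi^{-1})'(y^\#))|$ from the change of variables in $y$ and by the cutoff $h_j$ gives one term of the sum in the theorem.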


Under the assumption that $G$ and $G^\#$ are properly supported and $G$ has symbol
$g(x, \xi)$, let us use Theorem 8.17 to compute the symbol of $G^\#$, starting from the
formula in Theorem 8.20. For that computation all that is needed is the values
of $g^\#(x^\#, y^\#, \eta)$ for $(x^\#, y^\#)$ in any single neighborhood of the diagonal, however
small the neighborhood.

In Lemma 8.19, one can arrange for a single $N_X$, say the one for $X = X_1$, to
contain the entire diagonal of $U^\# \times U^\#$. The point $X_1$ can be one of the points
used in forming the partition of unity, and the corresponding function $h_1$ can be
arranged to be identically 1 in a neighborhood of the diagonal. Thus for purposes
of computing the symbol, we may drop all the terms for $j \neq 1$ and write the
formula of Theorem 8.20 as

\[ g^\#(x^\#, y^\#, \eta) = |\det((\Phi^{-1})'(y^\#))|\,|\det J_1(x^\#, y^\#)|^{-1}\, g\big(x, (J_1(x^\#, y^\#)^{\mathrm{tr}})^{-1}(\eta)\big). \]

Theorem 8.17b says that

\[ g^\#(x^\#, \eta) \sim \sum_\alpha \frac{(2\pi i)^{-|\alpha|}}{\alpha!}\, D_\eta^\alpha D_{y^\#}^\alpha\, g^\#(x^\#, y^\#, \eta)\Big|_{y^\# = x^\#}. \]

The term for $\alpha = 0$ in Theorem 8.17 comes from taking $y^\# = x^\#$ in $g^\#(x^\#, y^\#, \eta)$.
The function $J_1$ simplifies for this calculation and gives $J_1(x^\#, x^\#) = (\Phi^{-1})'(x^\#)$.
Let us summarize.


$^{14}$Smooth partitions of unity are discussed in Problem 5 at the end of the chapter.

Corollary 8.21. If $\Phi : U \to U^\#$ is a diffeomorphism between open sets in $\mathbb{R}^n$,
if $G : C^\infty_{\mathrm{com}}(U) \to C^\infty(U)$ is a properly supported pseudodifferential operator
with symbol $g(x, \xi)$ in $S^m_{1,0}(U)$, and if $G^\#$ is defined by $G^\#\psi = (G(\psi \circ \Phi)) \circ \Phi^{-1}$,
then $G^\#$ is a properly supported pseudodifferential operator on $U^\#$, and its symbol
$g^\#(x^\#, \eta)$ has the property that

\[ g^\#(x^\#, \eta) - g\big(\Phi^{-1}(x^\#), (((\Phi^{-1})'(x^\#))^{\mathrm{tr}})^{-1}(\eta)\big) \]

is in $S^{m-1}_{1,0}(U^\#)$.
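
As a check on the leading term, one can evaluate the amplitude of Theorem 8.20 on the diagonal, keeping only the term whose cutoff $h_1$ equals 1 there. This is a sketch of the $\alpha = 0$ contribution only; each term with $|\alpha| \ge 1$ carries at least one derivative in $\eta$, which is what lowers its order by at least one.

\[
\begin{aligned}
J_1(x^\#, x^\#) &= \int_0^1 (\Phi^{-1})'(x^\#)\,dt = (\Phi^{-1})'(x^\#), \\
g^\#(x^\#, x^\#, \eta) &= |\det((\Phi^{-1})'(x^\#))|\,|\det((\Phi^{-1})'(x^\#))|^{-1}\, g\big(\Phi^{-1}(x^\#), (((\Phi^{-1})'(x^\#))^{\mathrm{tr}})^{-1}(\eta)\big) \\
&= g\big(\Phi^{-1}(x^\#), (((\Phi^{-1})'(x^\#))^{\mathrm{tr}})^{-1}(\eta)\big).
\end{aligned}
\]

The two determinant factors cancel on the diagonal, which is why no Jacobian appears in the corollary.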


7. Pseudodifferential Operators on Manifolds 


With the Euclidean theory and the necessary tools of manifold theory in place,
we can now introduce pseudodifferential operators on manifolds. Let $M$ be an
$n$-dimensional separable smooth manifold. A typical compatible chart will be
denoted by $\kappa : M_\kappa \to \widetilde{M}_\kappa$, where $M_\kappa$ is open in $M$ and $\widetilde{M}_\kappa$ is open in $\mathbb{R}^n$. Fix
a smooth measure $\mu_g$ on $M$ as in Section 5, and let $\langle \varphi_1, \varphi_2 \rangle = \int_M \varphi_1 \varphi_2 \, d\mu_g$
whenever $\varphi_1$ and $\varphi_2$ are in $C^\infty(M)$ and at least one of them has compact support.

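A pseudodifferential operator on $M$ is going to be a certain kind of continuous
linear operator $G$ from $C^\infty_{\mathrm{com}}(M)$ into $C^\infty(M)$. The operator $G^{\mathrm{tr}} : C^\infty_{\mathrm{com}}(M) \to
C^\infty(M)$ such that $\langle G\varphi_1, \varphi_2 \rangle = \langle \varphi_1, G^{\mathrm{tr}}\varphi_2 \rangle$ for $\varphi_1$ and $\varphi_2$ in $C^\infty_{\mathrm{com}}(M)$ will be
another continuous linear operator of the same kind, and therefore the definition

\[ \langle G(T), \varphi \rangle = \langle T, G^{\mathrm{tr}}(\varphi) \rangle \qquad \text{for } \varphi \in C^\infty_{\mathrm{com}}(M) \text{ and } T \in \mathcal{E}'(M) \]

extends our $G$ to a linear function $G : \mathcal{E}'(M) \to \mathcal{D}'(M)$ in a natural way.

For any continuous linear operator $G : C^\infty_{\mathrm{com}}(M) \to C^\infty(M)$, the scalar-valued
function $\langle G\varphi_1, \varphi_2 \rangle$ on $C^\infty_{\mathrm{com}}(M) \times C^\infty_{\mathrm{com}}(M)$ is continuous and linear in
each variable when the other variable is held fixed, and it follows from a result
known as the Schwartz Kernel Theorem$^{15}$ that there exists a unique distribution
$\mathcal{G}$ in $\mathcal{D}'(M \times M)$ such that

\[ \langle G\varphi_1, \varphi_2 \rangle = \langle \mathcal{G}, \varphi_1 \otimes \varphi_2 \rangle \qquad \text{for } \varphi_1 \in C^\infty_{\mathrm{com}}(M) \text{ and } \varphi_2 \in C^\infty_{\mathrm{com}}(M), \]

where $\varphi_1 \otimes \varphi_2$ is the function on $M \times M$ with $(\varphi_1 \otimes \varphi_2)(x, y) = \varphi_1(x)\varphi_2(y)$. We
call $\mathcal{G}$ the distribution kernel of $G$. The distribution kernel $\mathcal{G}^{\mathrm{tr}}$ of $G^{\mathrm{tr}}$ is obtained
from the distribution kernel $\mathcal{G}$ by interchanging $x$ and $y$.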

In analogy with the Euclidean situation, we say that $G$ is properly supported
if the subset $\mathrm{support}(\mathcal{G})$ of $M \times M$ has compact intersection with $K \times M$ and with
$M \times K$ for every compact subset $K$ of $M$. In this case it follows for each compact
subset $K$ of $M$ that there exists a compact subset $L$ of $M$ such that $G(C^\infty_K) \subseteq C^\infty_L$.
Concretely the set $L$ is $p_1((M \times K) \cap \mathrm{support}(\mathcal{G}))$, where $p_1(x, y) = x$. Then
it is immediate that $G$ carries $C^\infty_{\mathrm{com}}(M)$ into $C^\infty_{\mathrm{com}}(M)$ and is continuous as such
a map. The same thing is true of $G^{\mathrm{tr}}$ since the definition of proper support is
symmetric in $x$ and $y$, and therefore the definition

\[ \langle G(T), \varphi \rangle = \langle T, G^{\mathrm{tr}}(\varphi) \rangle \qquad \text{for } \varphi \in C^\infty_{\mathrm{com}}(M) \text{ and } T \in \mathcal{D}'(M) \]

extends the properly supported $G$ to a linear function $G : \mathcal{D}'(M) \to \mathcal{D}'(M)$ in
a natural way.

$^{15}$A special case of the Schwartz Kernel Theorem is proved in Problems 14–19 at the end of
Chapter V. This special case is at the heart of the matter in the general case.

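A pseudodifferential operator of order $\le m$ on $M$ is a continuous linear
operator $G : C^\infty_{\mathrm{com}}(M) \to C^\infty(M)$ with the property, for every compatible chart
$\kappa$, that the operator $G_\kappa : C^\infty_{\mathrm{com}}(\widetilde{M}_\kappa) \to C^\infty(\widetilde{M}_\kappa)$ given by

\[ G_\kappa(\psi) = G(\psi \circ \kappa)\big|_{M_\kappa} \circ \kappa^{-1} \qquad \text{for } \psi \in C^\infty_{\mathrm{com}}(\widetilde{M}_\kappa) \]

is a generalized pseudodifferential operator on $\widetilde{M}_\kappa$ defined by an amplitude in
$S^m_{1,0,0}(\widetilde{M}_\kappa \times \widetilde{M}_\kappa)$. Theorem 8.20 shows that this condition about all compatible
charts is satisfied if it holds for all charts in an atlas.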

For such an operator the distribution kernel is automatically a smooth function 
away from the diagonal of M x M, as a consequence of the same fact about 
Euclidean pseudodifferential operators. One has only to realize that if two distinct 
points of M are given, then one can find compatible charts about the points whose 
domains are disjoint and whose images are disjoint; then the union of the charts 
is a compatible chart, and the fact about Euclidean operators can be applied. 

For a distribution on a smooth manifold, it makes sense to speak of the singular
support as the complement of the union of all open sets on which the distribution
is a smooth function, and the above fact about the distribution kernel implies that
any pseudodifferential operator $G$ on $M$ is pseudolocal in the sense that the
singular support of $G(T)$ is contained in the singular support of $T$ for every $T$ in $\mathcal{E}'(M)$.

The composition of two properly supported pseudodifferential operators on
$M$ is certainly defined as a continuous linear operator from $C^\infty_{\mathrm{com}}(M)$ into itself,
but a little care is needed in checking that the composition, when referred to
a compatible chart $\kappa$, is a generalized pseudodifferential operator on $\widetilde{M}_\kappa$. The
reason is that when $G$ is properly supported on $M$, it does not follow that the
restriction of $G$ to $M_\kappa$, i.e., to $C^\infty_{\mathrm{com}}(M_\kappa)$, is properly supported, not even if $M$ is
an open subset of $\mathbb{R}^n$. To handle this problem, we start from this observation: if
$G$ is any pseudodifferential operator on $M$, if $V$ is open in $M$, and if $\psi_1$ and $\psi_2$ are
in $C^\infty_{\mathrm{com}}(V)$, then the operator defined for $\varphi$ in $C^\infty_{\mathrm{com}}(V)$ by $\varphi \mapsto \psi_1 G(\psi_2 \varphi)$ is a
properly supported pseudodifferential operator on $V$; in fact, the distribution kernel
of this operator is supported in the compact subset $\mathrm{support}(\psi_2) \times \mathrm{support}(\psi_1)$
of $V \times V$.
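
The support claim can be read off from the defining identity of the distribution kernel. The following sketch uses the convention $(\varphi_1 \otimes \varphi_2)(x, y) = \varphi_1(x)\varphi_2(y)$ stated earlier, writes $\mathcal{G}$ for the distribution kernel of $G$, and takes $\varphi_1, \varphi_2$ to be arbitrary members of $C^\infty_{\mathrm{com}}(V)$.

\[
\begin{aligned}
\langle \psi_1 G(\psi_2 \varphi_1), \varphi_2 \rangle
&= \langle G(\psi_2 \varphi_1), \psi_1 \varphi_2 \rangle
= \langle \mathcal{G}, (\psi_2 \varphi_1) \otimes (\psi_1 \varphi_2) \rangle, \\
\big((\psi_2 \varphi_1) \otimes (\psi_1 \varphi_2)\big)(x, y) &= \psi_2(x)\,\psi_1(y)\,\varphi_1(x)\,\varphi_2(y).
\end{aligned}
\]

Thus the kernel of the cut-off operator is $(\psi_2 \otimes \psi_1)\,\mathcal{G}$, which vanishes off $\mathrm{support}(\psi_2) \times \mathrm{support}(\psi_1)$.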

This observation, the device used above for showing that distribution kernels
are smooth off the diagonal, and an argument with a partition of unity yield a
proof of the following lemma.


Lemma 8.22. If $L$ is a properly supported pseudodifferential operator on $M$
of order $\le m$ and $K$ is a compact subset of $M_\kappa$ for some compatible chart $\kappa$ of
$M$, then there exist compatible charts $\kappa_0, \kappa_1, \dots, \kappa_r$ with $\kappa_0 = \kappa$, with each $M_{\kappa_i}$
containing $K$ and, for each $i \ge 0$, with a properly supported pseudodifferential
operator $L_i$ on $M_{\kappa_i}$ such that $L(\varphi) = \sum_{i=0}^r L_i(\varphi)$ for every $\varphi$ in $C^\infty_K$.


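PROOF. Choose $K'$ compact such that $\varphi \in C^\infty_K$ implies $L(\varphi) \in C^\infty_{K'}$, and let
$\psi \ge 0$ be a member of $C^\infty_{\mathrm{com}}(M)$ that is 1 in a neighborhood of $K'$. Next choose
open neighborhoods $N, N', N''$ of $K$ such that $N'' \subseteq (N'')^{\mathrm{cl}} \subseteq N' \subseteq (N')^{\mathrm{cl}} \subseteq N \subseteq
N^{\mathrm{cl}} \subseteq M_\kappa$, with $N^{\mathrm{cl}}$ compact. Finally choose $\psi_1 \in C^\infty_{\mathrm{com}}(M)$ with values in $[0, 1]$
that is 1 on $N'$ and is 0 on $N^c$. Then $1 - \psi_1$ is 0 on $N'$ and hence has support
disjoint from $K$. Define $\psi_2 = (1 - \psi_1)\psi$.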

For each $x$ in the compact support of $\psi_2$, find a compatible chart containing
$x$ with domain $V_x$ contained in $((N'')^{\mathrm{cl}})^c$. The sets $V_x$ cover $\mathrm{support}(\psi_2)$, and there
is a finite subcover $V_1, \dots, V_r$. Since each $V_i$ with $i \ge 1$ is the domain of a
compatible chart and since $V_i \cap N'' = \emptyset$, there exists a compatible chart $\kappa_i$
with domain $V_i \cup N''$. Within the sets $V_i$, we can find open subsets $W_i$ with $W_i^{\mathrm{cl}}$
compact in $V_i$ such that the $W_i$ cover $\mathrm{support}(\psi_2)$. Repeating this process, we can
find open subsets $X_i$ with $X_i^{\mathrm{cl}}$ compact in $W_i$ such that the $X_i$ cover $\mathrm{support}(\psi_2)$.
By choosing, for each $i$, a smooth function on $\bigcup V_i$ with values in $[0, 1]$ that is 1 on
$X_i$ and is 0 off $W_i^{\mathrm{cl}}$ and by then dividing by the sum of these and a smooth function
that is positive on $\bigcup V_i - \bigcup W_i$ and is 0 in a neighborhood of $\mathrm{support}(\psi_2)$, we can
produce smooth functions $\eta_1, \dots, \eta_r$ on $\bigcup V_i$, all $\ge 0$, with sum identically 1 in
a neighborhood of $\mathrm{support}(\psi_2)$ such that $\eta_i$ has compact support in $V_i$. Then the
operators $L_0(\varphi) = \psi_1 L(\psi_1 \varphi)$ and, for $i \ge 1$, $L_i(\varphi) = \eta_i \psi_2 L(\psi_1 \varphi)$ have the
required properties.
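
The identity $L(\varphi) = \sum_i L_i(\varphi)$ on $C^\infty_K$ can be verified directly. The sketch below assumes the reading $L_0(\varphi) = \psi_1 L(\psi_1\varphi)$ and $L_i(\varphi) = \eta_i \psi_2 L(\psi_1\varphi)$ for $i \ge 1$, with $\psi_2 = (1 - \psi_1)\psi$ and with $\sum_i \eta_i = 1$ near $\mathrm{support}(\psi_2)$. For $\varphi$ in $C^\infty_K$ we have $\psi_1\varphi = \varphi$ because $\psi_1 = 1$ on $N' \supseteq K$, and $\psi L(\varphi) = L(\varphi)$ because $\psi = 1$ near $K' \supseteq \mathrm{support}(L(\varphi))$.

\[
\begin{aligned}
L_0(\varphi) + \sum_{i=1}^r L_i(\varphi)
&= \psi_1 L(\varphi) + \Big(\sum_{i=1}^r \eta_i\Big)\psi_2 L(\varphi) \\
&= \psi_1 L(\varphi) + (1 - \psi_1)\,\psi L(\varphi) \\
&= \psi_1 L(\varphi) + (1 - \psi_1) L(\varphi) = L(\varphi).
\end{aligned}
\]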


If we have a composition $J = GH$ of properly supported pseudodifferential
operators, we apply the lemma to $H$ to write $GH(\varphi) = \sum_i G(H_i(\varphi))$. For
each $i$, all members of $H_i(C^\infty_K)$ have support in some compact subset $L_i$ of
$M_{\kappa_i}$. Thus we can apply the lemma again to $G$ and the set $L_i$ to write $G$ as a
certain sum in a fashion depending on $i$. The result is that $GH$ is exhibited on
$C^\infty_K$ as a sum of terms, each of which is the composition of properly supported
operators within a compatible chart. Since compositions of properly supported
generalized pseudodifferential operators in Euclidean space are again properly
supported generalized pseudodifferential operators, each term of the sum is a
pseudodifferential operator on $M$. Thus $J = GH$ is a pseudodifferential operator
on $M$.

We turn to the question of symbols. As with linear differential operators, which
were discussed in Section 5, we cannot expect a coordinate-free meaning for the
symbol of a pseudodifferential operator on the smooth manifold $M$, even if the
operator is properly supported. But we can associate a “principal symbol” to
such an operator in many cases, generalizing the result for differential operators
in Section 5. For a linear differential operator of order $m$, we saw that the
principal symbol is a smooth function on the cotangent bundle $T^*(M, \mathbb{R})$ that is
homogeneous of degree $m$ in each fiber. For a pseudodifferential operator whose
order is not a nonnegative integer, the homogeneity may disrupt the smoothness at
the origin of each fiber, and we thus have to allow for a singularity. Accordingly,
let $T^*(M, \mathbb{R})^\times$ denote the cotangent bundle with the zero section removed, i.e.,
the closed subset consisting of the 0 element of each fiber is to be removed. The
principal symbol of order $m$ for a properly supported pseudodifferential operator
$G$ of order $\le m$ on $M$ will turn out to be, in cases where it is defined, a smooth
function on $T^*(M, \mathbb{R})^\times$ that is homogeneous of degree $m$ in each fiber.

Let $G$ be a pseudodifferential operator of order $\le m$ on $M$, and let $\kappa$ be
a compatible chart. Let $G_\kappa(\psi) = G(\psi \circ \kappa)\big|_{M_\kappa} \circ \kappa^{-1}$ be the corresponding
generalized pseudodifferential operator on $\widetilde{M}_\kappa$, and let $g_\kappa(x, y, \xi)$ be an amplitude
for it, so that $g_\kappa(x, y, \xi)$ is in $S^m_{1,0,0}(\widetilde{M}_\kappa \times \widetilde{M}_\kappa)$. Suppose that $\sigma_\kappa(x, \xi)$ is a smooth
function on $\widetilde{M}_\kappa \times (\mathbb{R}^n - \{0\})$ that is homogeneous of degree $m$ in the $\xi$ variable for
each fixed $x$ in $\widetilde{M}_\kappa$. The function $\sigma_\kappa(x, \xi)$ is not necessarily in $S^m_{1,0}(\widetilde{M}_\kappa)$ because of
the potential singularity at $\xi = 0$, but the function $\tau(\ell_x(\xi))\sigma_\kappa(x, \xi)$ is in $S^m_{1,0}(\widetilde{M}_\kappa)$
if $\tau$ is a smooth scalar-valued function on $\mathbb{R}^n$ that is 0 in a neighborhood of 0 and
is 1 for $|\xi|$ sufficiently large and if $x \mapsto \ell_x$ is a smooth function from $\widetilde{M}_\kappa$ into
$GL(n, \mathbb{F})$. Moreover, for any two choices of $\tau$ and $\ell_x$ of this kind, the difference
of the two symbols $\tau(\ell_x(\xi))\sigma_\kappa(x, \xi)$ is the symbol of a smoothing operator. Fix
such a $\tau$ and $\ell_x$. We say that $G_\kappa$ has principal symbol $\sigma_\kappa(x, \xi)$ if there is some
$\varepsilon > 0$ such that $g_\kappa(x, y, \xi) - \tau(\ell_x(\xi))\sigma_\kappa(x, \xi)$ is in $S^{m-\varepsilon}_{1,0,0}(\widetilde{M}_\kappa \times \widetilde{M}_\kappa)$. This
condition is independent of $\tau$ and $\ell_x$. We say that the given pseudodifferential
operator $G$ of order $\le m$ has a principal symbol, namely the family $\{\sigma_\kappa(x, \xi)\}$
as $\kappa$ varies, if this condition is satisfied for every $\kappa$ and if $\varepsilon$ can be taken to be
independent of $\kappa$.

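In this case we shall show that $\{\sigma_\kappa(x, \xi)\}$ is the system of local expressions for
a scalar-valued function on the part of the cotangent bundle of $M$ where $\xi \neq 0$,
the dependence in the cotangent space being homogeneous of degree $m$ at each
point of $M$; consequently one refers also to this function on $T^*(M, \mathbb{R})^\times$ as the
principal symbol. There is no assertion that a principal symbol exists, but it will
be unique when it exists.$^{16}$ Moreover, this definition agrees with the definition
in Section 5 in the case of a linear differential operator on $M$. To see that the
functions $\sigma_\kappa(x, \xi)$ correspond to a single function on $T^*(M, \mathbb{R})^\times$, suppose that $\kappa$
and $\kappa'$ are compatible charts whose domains overlap. Let $\kappa = (x_1, \dots, x_n)$ and
$\kappa' = (y_1, \dots, y_n)$. We write $y = y(x)$ for the function $\kappa' \circ \kappa^{-1}$ and $x = x(y)$
for the inverse function $\kappa \circ (\kappa')^{-1}$. Theorem 8.18 shows that there is no loss of
generality in assuming that the local expressions for $G$ in the charts $\kappa$ and $\kappa'$
have symbols in $S^m_{1,0}(\widetilde{M}_\kappa)$ and $S^m_{1,0}(\widetilde{M}_{\kappa'})$. Let these be $g_\kappa(x, \xi)$ and $g_{\kappa'}(y, \eta)$.
Corollary 8.21 shows that

\[ g_{\kappa'}(y, \eta) - g_\kappa\Big(x(y), \big(\big[\tfrac{\partial x_i}{\partial y_j}\big]^{-1}\big)^{\mathrm{tr}}(\eta)\Big) \]

is in $S^{m-1}_{1,0}(\kappa'(M_\kappa \cap M_{\kappa'}))$. Our construction shows that

\[ g_{\kappa'}(y, \eta) - \tau(\eta)\,\sigma_{\kappa'}(y, \eta) \]

and

\[ g_\kappa\Big(x(y), \big(\big[\tfrac{\partial x_i}{\partial y_j}\big]^{-1}\big)^{\mathrm{tr}}(\eta)\Big) - \tau\Big(\big(\big[\tfrac{\partial x_i}{\partial y_j}\big]^{-1}\big)^{\mathrm{tr}}(\eta)\Big)\,\sigma_\kappa\Big(x(y), \big(\big[\tfrac{\partial x_i}{\partial y_j}\big]^{-1}\big)^{\mathrm{tr}}(\eta)\Big) \]

are in $S^{m-\varepsilon}_{1,0}(\kappa'(M_\kappa \cap M_{\kappa'}))$. Therefore

\[ \tau\Big(\big(\big[\tfrac{\partial x_i}{\partial y_j}\big]^{-1}\big)^{\mathrm{tr}}(\eta)\Big)\,\sigma_\kappa\Big(x(y), \big(\big[\tfrac{\partial x_i}{\partial y_j}\big]^{-1}\big)^{\mathrm{tr}}(\eta)\Big) - \tau(\eta)\,\sigma_{\kappa'}(y, \eta) \]

is in $S^{m-\varepsilon'}_{1,0}(\kappa'(M_\kappa \cap M_{\kappa'}))$ for $\varepsilon' = \min(1, \varepsilon)$. For $y$ fixed and $|\eta|$ sufficiently
large, each term in this expression has the property that its value at $r\eta$ is $r^m$ times
its value at $\eta$ if $r \ge 1$. Then the same thing is true of the difference. Since the
condition of being in $S^{m-\varepsilon'}_{1,0}(\kappa'(M_\kappa \cap M_{\kappa'}))$ says that the absolute value of the
difference at $r\eta$ has to be $\le r^{m-\varepsilon'}$ times the absolute value of the difference at $\eta$,
the difference has to be 0 for $\eta$ sufficiently large. Therefore

\[ \sigma_\kappa\Big(x(y), \big(\big[\tfrac{\partial x_i}{\partial y_j}\big]^{-1}\big)^{\mathrm{tr}}(\eta)\Big) = \sigma_{\kappa'}(y, \eta) \]

for $y$ in $\kappa'(M_\kappa \cap M_{\kappa'})$. According to a computation with $T^*(M)$ in Section 4, the
family $\{\sigma_\kappa(x, \xi)\}$ satisfies the correct compatibility condition to be regarded as a
scalar-valued function on $T^*(M, \mathbb{R})^\times$. In short, we can treat the principal symbol
as a scalar-valued function on the cotangent bundle minus the zero section.


$^{16}$Some authors define the principal symbol more broadly, the local expression being the coset
of amplitudes for $G$ modulo amplitudes in $S^{m-1}_{1,0,0}(\widetilde{M}_\kappa \times \widetilde{M}_\kappa)$. This alternative definition, however,
does not reduce to the definition made in Section 5 for linear differential operators, and it seems
wise in the present circumstances to avoid it.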
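
The step in the argument above that forces the difference of the two homogeneous expressions to vanish can be spelled out as follows. This is a sketch: $d(y, \eta)$ is a name introduced here for that difference, $R$ is a radius beyond which both cutoff factors equal 1, and $C$ is the constant in the symbol estimate of order $m - \varepsilon'$ at the fixed point $y$.

\[
\begin{aligned}
d(y, r\eta) &= r^m\, d(y, \eta) \qquad \text{for } |\eta| \ge R,\ r \ge 1, \\
|d(y, r\eta)| &\le C\,(1 + r|\eta|)^{m - \varepsilon'} \le C_\eta\, r^{m - \varepsilon'}, \\
|d(y, \eta)| &= r^{-m}\,|d(y, r\eta)| \le C_\eta\, r^{-\varepsilon'} \longrightarrow 0 \quad \text{as } r \to \infty .
\end{aligned}
\]

Here $C_\eta$ depends on $y$ and $\eta$ but not on $r$, so $d(y, \eta) = 0$ whenever $|\eta| \ge R$, and homogeneity then gives the equality of the two principal-symbol expressions for all $\eta \neq 0$.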

The pseudodifferential operator $G$ on $M$ is said to be elliptic of order $m$ if its
principal symbol is nowhere 0 on $T^*(M, \mathbb{R})^\times$. It is a simple matter to check that
ellipticity in this sense is equivalent to the condition that all the local expressions
for the operator differ by smoothing operators$^{17}$ from operators that are elliptic
of order $m$ in the sense of Chapter VII.

Theorem 7.24 extends from Euclidean space to separable smooth manifolds:
any properly supported elliptic operator $G$ has a two-sided parametrix, i.e., a
properly supported pseudodifferential operator $H$ having $GH = 1 + \text{smoothing}$
and $HG = 1 + \text{smoothing}$. The proof consists of using Theorem 7.24 for each
member of an atlas and patching the results together by a smooth partition of
unity. A certain amount of work is necessary to arrange that the local operators
are properly supported. We omit the details.

As usual, the existence of the left parametrix implies a regularity result, namely that
the singular support of $Gf$ equals the singular support of $f$ if $f$ is in $\mathcal{E}'(M)$.

$^{17}$This condition takes into account Theorem 8.18, which says that the given operator differs by a
smoothing operator from an operator with a symbol. If the local operator is defined by an amplitude
and not a symbol, then ellipticity has not yet been defined for it.


8. Further Developments


Having arrived at a point in studying pseudodifferential operators on manifolds
comparable with where the discussion stopped for the Euclidean case, let us
briefly mention some further aspects of the theory that have a bearing on parts of
mathematics outside real analysis.


1. Quantitative estimates. Much of the discussion thus far has concerned the
effect of pseudodifferential operators on spaces of smooth functions of compact
support, and rather little has concerned distributions. Useful investigations of
what happens to distributions under such operators require further tools that
distinguish some distributions from others. A fundamental such tool is the
continuous family of Sobolev spaces denoted by $H^s$, or more specifically by
$H^s_{\mathrm{com}}(M)$ or $H^s_{\mathrm{loc}}(M)$, with $s$ being an arbitrary real number.

The starting point is the family of Hilbert spaces $H^s(\mathbb{R}^n)$ that were introduced
in Problems 8–12 at the end of Chapter III. The space $H^s(\mathbb{R}^n)$ consists of all
tempered distributions $T \in \mathcal{S}'(\mathbb{R}^n)$ whose Fourier transforms $\mathcal{F}(T)$ are locally
square integrable functions such that $\int_{\mathbb{R}^n} |\mathcal{F}(T)|^2 (1 + |\xi|^2)^s \, d\xi$ is finite, the norm
$\|\cdot\|_{H^s}$ being the square root of this expression. These spaces get larger as $s$
decreases. For $K$ compact in $\mathbb{R}^n$, let $H^s_K$ be the vector subspace of all members
of $H^s(\mathbb{R}^n)$ with support in $K$; this subspace is closed and hence is complete. If $U$
is open in $\mathbb{R}^n$, the space $H^s_{\mathrm{com}}(U)$ is the union of all spaces $H^s_K$ with $K$ compact
in $U$, and it is given the inductive limit topology from the closed vector subspaces
$H^s_K$. The space $H^s_{\mathrm{loc}}(U)$ is the space of all distributions $T$ on $U$ such that $\varphi T$
is in $H^s_{\mathrm{com}}(U)$ for all $\varphi$ in $C^\infty_{\mathrm{com}}(U)$; this space is topologized by the separating
family of seminorms $T \mapsto \|\varphi T\|_{H^s}$, and a suitable countable subfamily of these
seminorms suffices.

For $U$ open in $\mathbb{R}^n$, it is a consequence of Theorem 5.20 that each member of
$\mathcal{E}'(U)$ lies in $H^s_{\mathrm{com}}(U)$ for some $s$. There is no difficulty in defining $H^s_{\mathrm{com}}(M)$
and $H^s_{\mathrm{loc}}(M)$ for a separable smooth manifold $M$ in a coordinate-free way, and
the result persists that $\mathcal{E}'(M)$ is the union of all the spaces $H^s_{\mathrm{com}}(M)$ for $s$ real.
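
A standard illustration of how the index $s$ measures singularity is the Dirac distribution $\delta$ at the origin of $\mathbb{R}^n$, whose Fourier transform is the constant function 1. The following sketch computes which Sobolev spaces contain it; $\omega_{n-1}$ denotes the surface area of the unit sphere in $\mathbb{R}^n$, a symbol introduced only for this computation.

\[
\begin{aligned}
\|\delta\|_{H^s}^2 &= \int_{\mathbb{R}^n} (1 + |\xi|^2)^s \, d\xi
= \omega_{n-1} \int_0^\infty (1 + \rho^2)^s \rho^{\,n-1} \, d\rho, \\
\|\delta\|_{H^s}^2 < \infty \quad &\Longleftrightarrow \quad 2s + n - 1 < -1 \quad \Longleftrightarrow \quad s < -\tfrac{n}{2}.
\end{aligned}
\]

So $\delta$, which has compact support, lies in $H^s_{\mathrm{com}}(\mathbb{R}^n)$ exactly for $s < -n/2$, consistent with the statement that every compactly supported distribution lies in some $H^s_{\mathrm{com}}$.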

We have seen that any generalized pseudodifferential operator on $M$ carries
$\mathcal{E}'(M)$ into $\mathcal{D}'(M)$. The basic quantitative refinement of this result is that any
generalized pseudodifferential operator of order $\le m$ carries $H^s_{\mathrm{com}}(M)$ continuously
into $H^{s-m}_{\mathrm{loc}}(M)$.


2. Local existence for elliptic operators. We have seen that a properly 
supported elliptic pseudodifferential operator on a manifold has a two-sided 
parametrix. The existence of the left parametrix implies the regularity result 
that the elliptic operator maintains singular support. With the aid of the Sobolev 
spaces in subsection (1), one can prove that the existence of a right parametrix 
for an elliptic differential operator L with smooth coefficients implies a local 
existence theorem for the equation L(u) = f. 


3. Pseudodifferential operators on sections of vector bundles. The the- 
ory presented above concerned pseudodifferential operators that mapped scalar- 
valued functions on a manifold into scalar-valued functions on the manifold. 
The first step of useful generalization is to pseudodifferential operators carrying 
vector-valued functions to vector-valued functions; these provide a natural setting 
for considering systems of differential equations. The next step of useful general- 
ization is to pseudodifferential operators carrying sections of one vector bundle to 
sections of another vector bundle. The prototype is the differential operator d on 
a manifold, which carries smooth scalar-valued functions to smooth differential 
1-forms. The latter, as we know from Section 4, are not to be considered as vector- 
valued functions on the manifold but as sections of the cotangent bundle. The 
ease of adapting our known techniques to handling the operator d in this setting 
illustrates the ease of handling the overall generalization of pseudodifferential 
operators to sections. In considering the equation $df = 0$, for example, we can
use local coordinates and write $df(p) = \sum_i \frac{\partial f}{\partial x_i}(p)\, dx_i(p)$, regarding $\frac{\partial f}{\partial x_i}$ as a
coefficient function for a basis vector. If $df = 0$, then each coefficient must
be 0. So the partial derivatives of $f$ in local coordinates must vanish, and $f$
must be constant in local coordinates. Thus we have solved the equation in local
coordinates. When we pass from one local coordinate system to another, aligning
the basis vectors $dx_i$ requires taking the bundle structure into account, but that is a
separate problem from understanding $d$ locally. For a pseudodifferential operator
carrying sections of one vector bundle to sections of another, the formalism
is completely analogous. Locally we can regard the operator as a matrix of
generalized pseudodifferential operators of the kind considered earlier in this
section. One can introduce appropriate generalizations of the various notions
considered in this section and work with them without difficulty. In particular,
one can define principal symbol and ellipticity and can follow through the usual
kind of theory of parametrices for elliptic operators, obtaining the usual kind of
regularity result. In place of $H^s_{\mathrm{com}}(M)$ and $H^s_{\mathrm{loc}}(M)$, one works with spaces of
sections $H^s_{\mathrm{com}}(M, E)$ and $H^s_{\mathrm{loc}}(M, E)$, $E$ being a vector bundle.



4. Pseudodifferential operators on sections when the manifold is compact.
Of exceptional interest for applications is the situation in subsection (3) above
when the underlying smooth manifold is compact. Here every pseudodifferential
operator is of course properly supported, and the subscripts “com” and
“loc” for Sobolev spaces mean the same thing. Three fundamental tools in
this situation are the theory of “Fredholm operators,” a version of Sobolev’s
Theorem, saying that the members of $H^s(M, E)$ have $k$ continuous derivatives
if $s > [\tfrac{1}{2}\dim M] + k + 1$, and Rellich’s Lemma, saying that the inclusion of
$H^s(M, E)$ into $H^t(M, E)$ if $t < s$ carries bounded sets into sets with compact
closure. An important consequence is that the kernel of an elliptic operator of
order $m$ carrying $H^s(M, E)$ to $H^{s-m}(M, F)$ is finite dimensional, the dimension
being independent of $s$; moreover, the image of $H^s(M, E)$ in $H^{s-m}(M, F)$ has
finite codimension independent of $s$. The difference of the dimension of the kernel
and the codimension of the image is called the index of the elliptic operator and
plays a role in subsection (5) below.


5. Applications of the theory with sections over a compact manifold $M$.
In this discussion we shall freely use some terms that have not been defined in the 
text, putting many of them in quotation marks or boldface at their first occurrence. 


5a. A prototype of the theory of subsection (4) is Hodge theory, which
involves “higher-degree differential forms.” The operator $d$ carries smooth forms
of degree k to smooth forms of degree k + 1, hence is an operator from sections 
of one vector bundle to sections of another. If M is Riemannian, then the space 
of differential forms of each degree acquires an inner product, and there is a 
well-defined Laplacian dd* + d*d carrying the space of forms of each degree 
into itself. Forms annihilated by this Laplacian are called harmonic. Roughly 
speaking, the theory shows that the kernel of d on the space of forms of degree 
k is the direct sum of the harmonic forms of degree k and the image under d of 
the space of forms of degree k — 1. Consequently “de Rham’s Theorem” allows 
one to identify the space of harmonic forms with the cohomology of M with 
coefficients in the field of scalars F. 

5b. For any complex manifold $M$, there is an operator $\bar\partial$ on smooth differential
forms that plays the same role for the partial derivative operators $\frac{\partial}{\partial \bar z_j}$ that $d$ plays
for the operators $\frac{\partial}{\partial x_j}$. The same kind of analysis as in subsection (5a), when done
for a compact complex manifold with a Hermitian metric and a Laplacian of the
form $\bar\partial\bar\partial^* + \bar\partial^*\bar\partial$, identifies, roughly speaking, a suitable space of harmonic forms
as a vector-space complement to the image of $\bar\partial$ in a kernel for $\bar\partial$.


5c. For a Riemann surface $M$, a holomorphic-line-bundle version of subsection
(5b) leads to a proof$^{18}$ of the Riemann–Roch Theorem, a result allowing one
to compute the dimensions of various spaces of meromorphic sections on the
Riemann surface. For a compact complex manifold a holomorphic-vector-bundle
version of subsection (5b) leads to Hirzebruch’s generalization of the Riemann–Roch
Theorem.


5d. In place of $d$ or $\bar\partial$, one may use a version of a “Dirac operator” in the above
kind of analysis. The result is one path that leads to the Atiyah–Singer Index
Theorem, which relates a topological formula and an analytic formula for the
index of an elliptic operator from sections of one vector bundle over the compact
manifold to sections of another such bundle. This theorem has a number of
applications relating topology and analysis, and the Hirzebruch–Riemann–Roch
Theorem may be regarded as a special case.


BIBLIOGRAPHICAL REMARKS. There are several books on pseudodifferential
operators, and the treatment here in Chapters VII and VIII has been influenced
heavily by three of them: Hörmander’s Volume III of The Analysis of Linear Partial
Differential Equations, Taylor’s Pseudodifferential Operators, and Treves’s
Volume 1 of Introduction to Pseudodifferential and Fourier Integral Operators.$^{19}$

All three books use the definition $\hat f(\xi) = c \int_{\mathbb{R}^n} f(x) e^{-i x \cdot \xi} \, dx$ for the Fourier
transform, where $c = 1$ for Hörmander and Treves and $c = (2\pi)^{-n/2}$ for Taylor.
The definition here is $\hat f(\xi) = \int_{\mathbb{R}^n} f(x) e^{-2\pi i x \cdot \xi} \, dx$; this change forces small differences
in the constants involved in the definition of pseudodifferential operators
and results like Theorems 7.22 and 8.17. Another difference in notation is that
these books include a power of $i = \sqrt{-1}$ in the definition of $D^\alpha$, and this text
does not; inclusion of the power of $i$ follows a tradition dating back to the work
of Hermann Weyl and seems an unnecessary encumbrance at this level.
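
For translating formulas between the conventions, the two transforms are related by a rescaling of the frequency variable. In the sketch below, $\mathcal{F}_c$ is a label introduced here for the transform $c \int_{\mathbb{R}^n} f(x) e^{-i x \cdot \xi} \, dx$ used in the three books, and $\hat f$ is the transform of this text.

\[
\hat f(\xi) = \int_{\mathbb{R}^n} f(x)\, e^{-i x \cdot (2\pi \xi)} \, dx = c^{-1}\, (\mathcal{F}_c f)(2\pi \xi),
\qquad
(\mathcal{F}_c f)(\xi) = c\, \hat f\big(\tfrac{\xi}{2\pi}\big).
\]

Powers of $2\pi$ in symbol expansions therefore appear or disappear according to which convention is in force.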

The books by Hörmander and Treves assume extensive knowledge of material
in separate books by the authors concerning distributions; Taylor makes extensive
use of distributions and includes a very brief summary of them in Chapter I. Treves
uses a smooth measure on a manifold in order to identify smooth functions with
distributions,$^{20}$ but Hörmander does not.

$^{18}$Not the standard proof.
$^{19}$Full references for these books and other sources may be found in the section References at
the end of the book.

The relevant sections of those books for the material in Sections VII.6, VIII.6,
and VIII.7 are as follows: Section 18.1 of Hörmander’s book, Sections II.1–II.5
and III.1 of Taylor’s book, and Sections I.1–I.5 of the Treves book.

The relevant portions of the three books for the mathematics in Section VIII.8
include the following: (1) Hörmander, pp. 90–91; Taylor, Section II.6; Treves,
pp. 16–18 and 47. (2) Taylor, Section VI.3; Treves, pp. 92–93. (3) Hörmander,
pp. 91–92; Treves, Section I.7. (4) Hörmander, Chapter XIX; Treves, Section
II.2.

A larger number of books use pseudodifferential operators for some particular
kind of application, sometimes developing a certain amount of the abstract theory
of pseudodifferential operators. Among these are Wells, Differential Analysis on
Complex Manifolds, which addresses applications (5a), (5b), and (5c) above;
Lawson–Michelsohn, Spin Geometry, which addresses application (5d) above;
and Stein, Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory
Integrals, which uses pseudodifferential operators to study the behavior
of holomorphic functions on the boundaries of domains in $\mathbb{C}^n$, as well as related
topics. Hörmander’s book is another one that addresses application (5d), but it
does so less completely than Lawson–Michelsohn.

For a brief history of pseudodifferential operators and the relationship of
the theory to results like the Calderón–Zygmund Theorem, see Hörmander,
pp. 178–179. For more detail about how pseudodifferential operators capture
the idea of a freezing principle, see Stein, pp. 230–231.


9. Problems 


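1. Verify that the unit sphere $M = S^n$ in $\mathbb{R}^{n+1}$, the set of vectors of norm 1, can
be made into a smooth manifold of dimension $n$ by using two charts defined as
follows. One of these charts is

\[ \kappa_1(x_1, \dots, x_{n+1}) = \Big( \frac{x_1}{1 - x_{n+1}}, \dots, \frac{x_n}{1 - x_{n+1}} \Big) \]

with domain $M_{\kappa_1} = S^n - \{(0, \dots, 0, 1)\}$, and the other is

\[ \kappa_2(x_1, \dots, x_{n+1}) = \Big( \frac{x_1}{1 + x_{n+1}}, \dots, \frac{x_n}{1 + x_{n+1}} \Big) \]

with domain $M_{\kappa_2} = S^n - \{(0, \dots, 0, -1)\}$.


$^{20}$For a while, anyway.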

2. Set-theoretically, the real $n$-dimensional projective space $M = \mathbb{R}P^n$ can be
defined as the result of identifying each member $x$ of $S^n$ in the previous problem
with its antipodal point $-x$. Let $[x] \in \mathbb{R}P^n$ denote the class of $x \in S^n$.

(a) Show that $d([x], [y]) = \min\{|x - y|, |x + y|\}$ is well defined and makes
$\mathbb{R}P^n$ into a metric space such that the function $x \mapsto [x]$ is continuous and
carries open sets to open sets.

(b) For each $j$ with $1 \le j \le n + 1$, define

\[ \kappa_j[(x_1, \dots, x_{n+1})] = \Big( \frac{x_1}{x_j}, \dots, \frac{x_{j-1}}{x_j}, \frac{x_{j+1}}{x_j}, \dots, \frac{x_{n+1}}{x_j} \Big) \]

on the domain $M_{\kappa_j} = \{[(x_1, \dots, x_{n+1})] \mid x_j \neq 0\}$. Show that the system
$\{\kappa_j \mid 1 \le j \le n + 1\}$ is an atlas for $\mathbb{R}P^n$ and that the function $x \mapsto [x]$ from
$S^n$ to $\mathbb{R}P^n$ is smooth.


3. Let $X$ be a smooth manifold.

(a) Prove that if $X$ is Lindelöf, or is $\sigma$-compact, or has a countable dense set,
then $X$ has an atlas with countably many charts.

(b) Prove that if $X$ has an atlas with countably many charts, then $X$ is separable.


4. The real general linear group $G = GL(n, \mathbb{R})$ is the group of invertible $n$-by-$n$
matrices with entries in $\mathbb{R}$, the group operation being matrix multiplication. The
space of all $n$-by-$n$ real matrices $A$ may be identified with $\mathbb{R}^{n^2}$, and $GL(n, \mathbb{R})$
is then the open set where $\det A \neq 0$. As an open subset of $\mathbb{R}^{n^2}$, it is a smooth
manifold with an atlas consisting of one chart. The coordinate functions $x_{ij}(g)$
yield the entries $g_{ij}$ of $g$.

(a) Prove that matrix multiplication, as a mapping of $G \times G$ into $G$, is a smooth
mapping. Prove that matrix inversion, as a mapping from $G$ into $G$, is
smooth.

(b) If $A$ is a matrix with entries $A_{ij}$, identify $A$ as a member of $T_1(G)$ by $A \leftrightarrow
\sum_{i,j} A_{ij} \big[\frac{\partial}{\partial x_{ij}}\big]_1$. Let $l_g$ be the diffeomorphism of $G$ given by $l_g(h) = gh$.
Define a vector field $\widetilde{A}$ by $\widetilde{A}_g f = (dl_g)_1(A)(f)$ if $f$ is defined near $g$. Prove
that $\widetilde{A}_g f = \sum_{i,j} (gA)_{ij} \frac{\partial f}{\partial x_{ij}}(g)$.

(c) Prove that $\widetilde{A}$ is smooth and is left invariant in the sense of being carried to
itself by all $l_g$’s.

(d) Show that $c(t) = g_0 \exp tA$ is the integral curve for $\widetilde{A}$ such that $c(0) = g_0$.

(e) Prove that if $f$ is in $C^\infty(G)$, then $\widetilde{A} f(g) = \frac{d}{dt} f(g \exp tA)\big|_{t=0}$.


5. This problem concerns the existence of smooth partitions of unity on a separable
smooth manifold $M$. Let $\{K_l\}_{l \ge 1}$ be an exhausting sequence for $M$. For $l = 0$, put
$L_0 = K_2$ and $U_0 = K_3^o$. For $l \ge 1$, put $L_l = K_{l+2} - K_{l+1}^o$ and $U_l = K_{l+3}^o - K_l$.
Each point of $M$ lies in some $L_l$ and has a neighborhood lying in only finitely
many $U_l$’s.

(a) Using the exhausting sequence, find an atlas $\{\kappa_\alpha\}$ of compatible charts such
that each point of $M$ has a neighborhood lying in only finitely many $M_{\kappa_\alpha}$’s.

(b) By applying Proposition 8.2 within each member of a suitable atlas as in
(a), show that there exists $\eta_\alpha \in C^\infty_{\mathrm{com}}(M_{\kappa_\alpha})$ for each $\alpha$ with values in $[0, 1]$
such that $\sum_\alpha \eta_\alpha$ is everywhere $> 0$. Normalizing, conclude that there exists
$\varphi_\alpha \in C^\infty_{\mathrm{com}}(M_{\kappa_\alpha})$ for each $\alpha$ with values in $[0, 1]$ such that $\sum_\alpha \varphi_\alpha$ is 1
identically on $M$.

(c) Prove that if $K$ is compact in $M$ and $U$ is open with $K \subseteq U$, then there
exists $\varphi$ in $C^\infty_{\mathrm{com}}(U)$ with values in $[0, 1]$ such that $\varphi$ is 1 everywhere on $K$.

(d) Prove that if $K$ is compact in $M$ and $\{U_1, \dots, U_r\}$ is a finite open cover of
$K$, then there exist $\varphi_j$ in $C^\infty_{\mathrm{com}}(U_j)$ for $1 \le j \le r$ with values in $[0, 1]$ such
that $\sum_{j=1}^r \varphi_j$ is 1 on $K$.


Problems 6-7 concern local coordinate systems on smooth manifolds. 


6. Let $M$ and $N$ be smooth manifolds of dimensions $n$ and $k$, let $p$ be in $M$,
suppose that $F : M \to N$ is a smooth function such that $dF_p$ carries $T_p(M)$ onto
$T_{F(p)}(N)$, and suppose that $\lambda$ is a compatible chart for $N$ about $F(p)$ such that
$\lambda = (y_1, \dots, y_k)$. Prove that the functions $y_1 \circ F, \dots, y_k \circ F$ can be taken as
the first $k$ of $n$ functions that generate a system of local coordinates near $p$ in the
sense of Proposition 8.4.

7. Let $M$ and $N$ be smooth manifolds of dimensions $n$ and $k$, let $p$ be in $M$, suppose
that $F : M \to N$ is a smooth function such that $dF_p$ is one-one, and suppose
that $\psi = (y_1, \dots, y_k)$ is a compatible chart for $N$ about $F(p)$.

(a) Prove that it is possible to select from the set of functions $y_1 \circ F, \dots, y_k \circ F$
a subset of $n$ of them that generate a system of local coordinates near $F(p)$
in the sense of Proposition 8.4.

(b) Let $\varphi = (x_1, \dots, x_n)$ be a compatible chart for $M$ about $p$. Prove that
there exists a system of local coordinates $(z_1, \dots, z_k)$ near $F(p)$ such that
$x_j$ coincides in a neighborhood of $p$ with $z_j \circ F$ for $1 \leq j \leq n$.


Problems 8-9 concern extending Sard’s Theorem (Theorem 6.35 of Basic) to
separable smooth manifolds. Let $M$ be an $n$-dimensional separable smooth manifold,
and let $\{\kappa_\alpha\}$ be an atlas of charts. A subset $S$ of $M$ has measure 0 if $\kappa_\alpha(S \cap M_{\kappa_\alpha})$
has $n$-dimensional Lebesgue measure 0 for all $\alpha$. If $F : M \to N$ is a smooth map
between smooth $n$-dimensional manifolds $M$ and $N$, a critical point $p$ of $F$ is a point
where the differential $(dF)_p$ has rank $< n$. In this case, $F(p)$ is called a critical
value.


8. Prove that if $F : M \to N$ is a smooth map between two smooth separable
$n$-dimensional manifolds $M$ and $N$, then the set of critical values of $F$ has
measure 0 in $N$.

9. Prove that if $F : M \to N$ is a smooth map between two separable smooth
manifolds and if $\dim M < \dim N$, then the image of $F$ has measure 0 in $N$.




Problems 10-13 introduce equivalence of vector bundles, which is the customary
notion of isomorphism for vector bundles with the same base space. Let $\pi : B \to M$
and $\pi' : B' \to M$ be two smooth coordinate vector bundles of the same rank $n$ with
the same field of scalars and same base space $M$, but with distinct bundle spaces,
distinct projections, possibly distinct atlases $\mathcal{A} = \{\kappa_j\}$ and $\mathcal{A}' = \{\kappa'_k\}$ for $M$, distinct
coordinate functions $\phi_j$ and $\phi'_k$, and distinct transition functions $g_{jk}(x)$ and $g'_{jk}(x)$.
Let $h : B \to B'$ be a fiber-preserving smooth map covering the identity map of $M$,
i.e., a smooth map such that $h(\pi^{-1}(x)) = (\pi')^{-1}(x)$ for all $x$ in $M$. For each $x$ in $M$,
define $h_x$ to be the smooth map obtained by restriction $h_x = h|_{\pi^{-1}(x)}$; this carries
$\pi^{-1}(x)$ to $(\pi')^{-1}(x)$. Say that $h$ exhibits $\pi : B \to M$ and $\pi' : B' \to M$ as equivalent
coordinate vector bundles if the following two conditions are satisfied:

• whenever $\kappa_j$ and $\kappa'_k$ are charts in $\mathcal{A}$ and $\mathcal{A}'$ about a point $x$ of $M$, then the map

$$\bar{g}_{kj}(x) = (\phi'_{k,x})^{-1} \circ h_x \circ \phi_{j,x}$$

of $\mathbb{F}^n$ into itself coincides with the operation of a member of $GL(n, \mathbb{F})$,
• the map $\bar{g}_{kj} : M_{\kappa_j} \cap M_{\kappa'_k} \to GL(n, \mathbb{F})$ is smooth.

The functions $x \mapsto \bar{g}_{kj}(x)$ will be called the mapping functions of $h$.


10. Prove for coordinate vector bundles that “equivalent” is reflexive and transitive 
and that strictly equivalent implies equivalent. 


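11. Prove that if $h$ exhibits two coordinate vector bundles $\pi : B \to M$ and
$\pi' : B' \to M$ as equivalent, then the mapping functions $x \mapsto \bar{g}_{kj}(x)$ of $h$
satisfy the conditions

$$\bar{g}_{kj}(x)\,g_{ji}(x) = \bar{g}_{ki}(x) \quad \text{for } x \in M_{\kappa_i} \cap M_{\kappa_j} \cap M_{\kappa'_k},$$
$$g'_{lk}(x)\,\bar{g}_{kj}(x) = \bar{g}_{lj}(x) \quad \text{for } x \in M_{\kappa_j} \cap M_{\kappa'_k} \cap M_{\kappa'_l}.$$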

12. Suppose that $\pi : B \to M$ and $\pi' : B' \to M$ are two smooth coordinate vector
bundles of the same rank $n$ with the same field of scalars relative to atlases
$\mathcal{A} = \{\kappa_j\}$ and $\mathcal{A}' = \{\kappa'_k\}$ of $M$.

(a) If smooth functions $x \mapsto \bar{g}_{kj}(x)$ of $M_{\kappa_j} \cap M_{\kappa'_k}$ into $GL(n, \mathbb{F})$ are given that
satisfy the displayed conditions in Problem 11, prove that there exists at
most one equivalence $h : B \to B'$ of coordinate vector bundles having $\{\bar{g}_{kj}\}$
as mapping functions and that it is given by $h(\phi_{j,x}(y)) = \phi'_{k,x}(\bar{g}_{kj}(x)(y))$.

(b) Prove that “equivalent” for coordinate vector bundles is symmetric, and con- 
clude that “equivalent” is an equivalence relation whose equivalence classes 
are unions of equivalence classes under strict equivalence. (Educational 
note: Therefore the notion of equivalent vector bundles is well defined.) 


13. Suppose that $\pi : B \to M$ and $\pi' : B' \to M$ are two smooth coordinate vector
bundles of the same rank $n$ with the same field of scalars relative to atlases
$\mathcal{A} = \{\kappa_j\}$ and $\mathcal{A}' = \{\kappa'_k\}$ of $M$, and suppose that smooth functions $x \mapsto \bar{g}_{kj}(x)$
of $M_{\kappa_j} \cap M_{\kappa'_k}$ into $GL(n, \mathbb{F})$ are given that satisfy the displayed conditions in
Problem 11.

(a) Define a smooth mapping $h_{kj}$ from $\pi^{-1}(M_{\kappa_j} \cap M_{\kappa'_k})$ in $B$ to $(\pi')^{-1}(M_{\kappa_j} \cap M_{\kappa'_k})$ in $B'$
as follows: If $b$ is in $B$ with $x = \pi(b)$ in $M_{\kappa_j} \cap M_{\kappa'_k}$, let $p_j(b) = \phi_{j,x}^{-1}(b) \in \mathbb{F}^n$,
and set

$$h_{kj}(b) = \phi'_{k,x}\big(\bar{g}_{kj}(x)(p_j(b))\big).$$

Prove that $\{h_{kj}\}$ is consistently defined as one moves from chart to chart,
i.e., that if $x$ lies also in $M_{\kappa_i} \cap M_{\kappa'_l}$, then $h_{kj}(b) = h_{li}(b)$, and conclude that
the functions $h_{kj}$ piece together as a single smooth function $h : B \to B'$.

(b) Prove that the functions $x \mapsto \bar{g}_{kj}(x)$ coincide with the mapping functions
of $h$, and conclude that the existence of functions satisfying the displayed
conditions in Problem 11 is necessary and sufficient for equivalence.


CHAPTER IX 


Foundations of Probability 


Abstract. This chapter introduces probability theory as a system of models, based on measure 
theory, of some real-world phenomena. The models are measure spaces of total measure 1 and
usually have certain distinguished measurable functions defined on them. 

Section 1 begins by establishing the measure-theoretic framework and a short dictionary for 
passing back and forth between terminology in measure theory and terminology in probability theory. 
The latter terminology includes events, random variables, expectation, distribution of a random 
variable, and joint distribution of several random variables. An important feature of probability is 
that it is possible to work with random variables without any explicit knowledge of the underlying 
measure space, the joint distributions of random variables being the objects of importance. 

Section 2 introduces conditional probability and uses that to motivate the mathematical definition 
of independence of events. In turn, independence of events leads naturally to a definition of 
independent random variables. Independent random variables are of great importance in the subject 
and play a much larger role than their counterparts in abstract measure theory. 

Section 3 states and proves the Kolmogorov Extension Theorem, a foundational result allowing 
one to create stochastic processes involving infinite sets of times out of data corresponding to finite 
subsets of those times. A special case of the theorem provides the existence of infinite sets of 
independent random variables with specified distributions. 

Section 4 establishes the celebrated Strong Law of Large Numbers, which says that the Cesàro
sums of a sequence of identically distributed independent random variables with finite expectation 
converge almost everywhere to the expectation. This is a theorem that is vaguely known to the 
general public and is widely misunderstood. The proof is based on Kolmogorov’s inequality. 


1. Measure-Theoretic Foundations 


Although notions of probability have been around for hundreds of years, it was 
not until the twentieth century, with the introduction of Lebesgue integration, that 
the foundations of probability theory could be established in any great generality. 
The early work on foundations was done between 1929 and 1933 chiefly by A. N. 
Kolmogorov and partly by M. Fréchet. 

First of all, the idea is that probability theory consists of models for some 
experiences in the real world. Second of all, these experiences are statistical in 
nature, involving repetition. Thus one attaches probability 1/2 to the outcome 
of “heads” for one flip of a standard coin based on what has been observed over
a period of time. One even goes so far as to attach probabilities to outcomes 
that one can think of repeating even if they cannot be repeated as a practical 
matter, such as the probability that a particular person will die from a certain kind 
of surgery. But one does not try to incorporate probabilities into the theory for 
contingencies that cannot remotely be regarded as repeatable. The philosopher 
R. Carnap has asked, “What is the probability that the fair coin I have just tossed 
has come up ‘heads’?” He would insist that the answer is 0 or 1, certainly
not 1/2. Mathematical probability theory leaves his question as something for 
philosophers and does not address it. 

The initial situation that is to be modeled is that of an experiment to be 
performed; the experiment may be really simple, as with a single coin toss, 
or it may have stages to it that may or may not be related to each other. For the 
moment let us suppose that the number of stages is finite; later we shall relax 
this condition. To fix the ideas, let us think of the outcome as a point in some 
Euclidean space. Forcing the outcome to be a point in a Euclidean space may 
not at first seem very natural for a single toss of a coin, but we can, for example, 
identify “heads” with 1 and “tails” with 0 in $\mathbb{R}^1$. In any case, the experiment has
a certain range of conceivable outcomes, and these outcomes are to be disjoint
from one another. Initially we let $\Omega$ be the set of these conceivable outcomes. If
an outcome occurs when conditions belonging to a set A are satisfied, one says 
that the event A has taken place. 

We imagine that probabilities have somehow been attached to the individual 
outcomes, and to aggregates of them, on the basis of some experimental data. Us- 
ing a frequency interpretation of probability, one is led to postulate that probability 
in the model of this experiment is a nonnegative additive set function on some 
system of subsets of $\Omega$ that assigns the value 1 to $\Omega$ itself. Without measure theory
as a historical guide, one might be hard pressed to postulate complete additivity as
well, but in retrospect complete additivity is not a surprising condition to impose. 

At any rate, the model of the experiment within probability theory uses a 
measure space $(\Omega, \mathcal{A}, P)$, normally with total measure $P(\Omega)$ equal to 1, with
one or more measurable functions on $\Omega$ to indicate the result of the experiment.
One way of setting up $(\Omega, \mathcal{A}, P)$ is as we just did: to let $\Omega$ be the set of all
possible outcomes, i.e., all possible values of the measurable functions that give 
the result of the experiment. Events are then simply measurable sets of outcomes, 
and the measure P gives the probabilities of various sets of outcomes. Yet this 
is not the only way, and successful work in the subject of probability theory 
requires a surprising indifference to the nature of the particular $\Omega$ used to model
a particular experiment. 

We can give a rather artificial example right now, in the context of a single
toss of a standard coin, of how distinct $\Omega$'s might be used to model the same
experiment, and we postpone to the last two paragraphs of this section and to
the proof of Theorem 9.8 any mention of more natural situations in which one
wants to allow distinct $\Omega$'s in general. The example occurs when the experiment
is a single flip of a standard coin. Let us identify “heads” with the real number 1
and “tails” with the real number 0. Centuries of data and of processing the data
have led to a consensus that the probabilities are to be 1/2 for each of the two
possible outcomes, 1 and 0. We can model this situation by taking $\Omega$ to be the set
$\{1, 0\}$ of outcomes, $\mathcal{A}$ to consist of all subsets of $\Omega$, and $P$ to assign weight 1/2
to each point of $\Omega$. The function $f$ indicating the result of the experiment is the
identity function, with $f(\omega) = 1$ if $\omega = 1$ and with $f(\omega) = 0$ if $\omega = 0$. But it
would be just as good to take any other measure space $(\Omega, \mathcal{A}, P)$ with $P(\Omega) = 1$
and to suppose that there is some measurable subset $A$ with $P(A) = 1/2$. The
measurable function $f$ modeling the experiment has $f(\omega) = 1$ if $\omega$ is in $A$ and
$f(\omega) = 0$ if not.
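
The indifference to the choice of underlying space can be illustrated with a small computation. The sketch below is only an illustration under stated assumptions: the second model (a fair die with the even faces playing the role of the set A) and the helper name `distribution` are hypothetical choices, not taken from the text. Both models give the function f the same probabilities for the values 1 and 0.

```python
from fractions import Fraction

def distribution(weights, f):
    """Collect the probability that f takes each of its values.

    `weights` maps each point of a finite sample space to its weight.
    """
    dist = {}
    for omega, weight in weights.items():
        value = f(omega)
        dist[value] = dist.get(value, Fraction(0)) + weight
    return dist

# Model 1 (from the text): sample space {1, 0}, weight 1/2 each, f = identity.
model1 = {1: Fraction(1, 2), 0: Fraction(1, 2)}
f1 = lambda omega: omega

# Model 2 (hypothetical): the faces of a fair die, with A = even faces,
# so that P(A) = 1/2, and f = 1 on A and 0 off A.
model2 = {face: Fraction(1, 6) for face in range(1, 7)}
A = {2, 4, 6}
f2 = lambda omega: 1 if omega in A else 0

dist1 = distribution(model1, f1)
dist2 = distribution(model2, f2)
print(dist1 == dist2)  # the two models agree on the coin's probabilities
```

Exact rational arithmetic (`Fraction`) is used so that the comparison of the two models is not affected by rounding.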

The problem of how to take real-world data and to extract probabilities in 
preparation for defining a model is outside the domain of probability theory. This 
involves a statistical part that obtains and processes the data, identifies levels of 
confidence in the accuracy of the data, and assesses the effects of errors made in 
obtaining the data accurately. Also it may involve making some value judgments, 
such as what confidence levels to treat as decisive, and such value judgments are 
perhaps within the domain of politicians. In addition, there is a fundamental 
philosophical question in whether the model, once constructed, faithfully reflects 
reality. This question is similar to the question of whether mathematical physics 
reflects the physics of the real world, but with one complication: in physics 
there is always the possibility that a single experimental result will disprove the 
model, whereas probability gives no prediction that can be disproved by a single 
experimental result. 

Apart from a single toss of a coin, another simple experiment whose outcome 
can be expressed in terms of a single real number is the selection of a “random” 
number from [0, 2]. The word “random” in this context, when not qualified in 
some way, insists as a matter of definition that the experiment is governed by 
normalized Lebesgue measure, that the probability of picking a number within a 
set A is the Lebesgue measure of A divided by the Lebesgue measure of [0, 2]. If 
we take $\Omega$ to be $[0, 2]$, $\mathcal{A}$ to be the Borel sets, and $P$ to be $\frac{1}{2}\,dx$ and if we use the
identity function as the measurable function telling the outcome, then we have 
completely established a model. 

The theory needed for setting up a model that incorporates given probabilities 
is normally not so readily at hand, since one is quite often interested potentially in 
infinitely many stages to an experiment and the given data concern only finitely 
many stages at a time. In many cases of this kind, one invokes a fundamental 
theorem of Kolmogorov to set up a measure space that can allow the set of 
distinguished measurable functions to be infinite in number. We shall state and
prove this theorem in Section 3. 

In the meantime let us take the measure space $(\Omega, \mathcal{A}, P)$ with $P(\Omega) = 1$ as
given to us. We refer to $(\Omega, \mathcal{A}, P)$ or simply $(\Omega, P)$ as a probability space.
Probability theory has its own terminology. An event is a measurable set, thus a
set in the $\sigma$-algebra $\mathcal{A}$. One speaks of the “probability of an event,” which means
the P measure of the set. The language used for an event is often slightly different 
from the ordinary way of defining a set. With the random-number example above, 
one might well speak of the probability of the “event that the random number lies 
in [1/2, 1]” when a more literal description is that the event is [1/2, 1]. It is not 
a large point. The probability in either case, of course, is 1/4. 

Let $A$ and $B$ be events. The event $A \cap B$ is the simultaneous occurrence of $A$
and $B$. The event $A \cup B$ is the event that at least one of $A$ and $B$ occurs. The
event $A^c$ is the nonoccurrence of the event $A$. If $A = \varnothing$, event $A$ is impossible; if
$A = \Omega$, event $A$ must occur. Containment $B \subseteq A$ means that from the occurrence
of event $B$ logically follows the occurrence of event $A$. Two events $A$ and $B$ are
incompatible if $A \cap B = \varnothing$. A set-theoretic partitioning $C$ of $\Omega$ as a disjoint
union $\Omega = \bigcup_{k=1}^n A_k$ corresponds to an experiment $C$ consisting of determining
which of the events $A_1, \dots, A_n$ occurs. And so on.
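
The dictionary between set operations and events can be tried out on a finite example. The following is a minimal sketch, assuming one roll of a fair die as the sample space; the names `OMEGA`, `prob`, and the particular events are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # one roll of a fair die (illustrative choice)

def prob(event):
    """P(event) for the uniform probability measure on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

A = frozenset({2, 4, 6})  # "the roll is even"
B = frozenset({4, 5, 6})  # "the roll is at least 4"

both = A & B              # simultaneous occurrence of A and B
either = A | B            # at least one of A and B occurs
not_A = OMEGA - A         # nonoccurrence of A
C = frozenset({6})        # C is contained in A: if C occurs, then A occurs
incompatible = A & frozenset({1, 3})  # empty intersection: incompatible events

print(prob(both), prob(either), prob(not_A), C <= A, len(incompatible))
```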

A random variable is a real-valued measurable function on $\Omega$. With the
random-number example, a particular random variable is the number selected.
This is the function $f$ that associates the real number $\omega$ to the member $\omega$ of the
space $\Omega$. The word “random” in the name “random variable” refers to the fact that
its value depends on which possibility in $\Omega$ is under consideration. Some latitude
needs to be made in the definition of measurable function to allow a function
taking on values “heads” and “tails” to be a random variable, but this point will
not be important for our purposes.¹ As we shall see, the random variables that
yield the result of the defining experiment of a probability model are, in a number
of important cases, coordinate functions on a set $\Omega$ given as a product, and random
variables are often indicated by letters like $x$ suitable for coordinates.²

The expectation or expected value $E(x)$ of the random variable $x$ is motivated
by a computation in the especially simple case that $\Omega$ contains finitely many
outcomes/points and $P(A)$ is computed for an event by adding the weights attached
to the outcomes $\omega$ of $A$. If $\omega$ is an outcome, the value of $x$ at $\omega$ is $x(\omega)$, and
this outcome occurs with probability $P(\{\omega\})$. Summing over all outcomes, we
obtain $\sum_{\omega \in \Omega} x(\omega) P(\{\omega\})$ as a reasonable notion of the expected value. This sum
suggests a Lebesgue integral, and accordingly the definition in the general case is
that $E(x) = \int_\Omega x(\omega)\,dP(\omega)$. Probabilists say that $E(x)$ exists if $x$ is integrable;
cases in which the Lebesgue integral exists and is infinite are excluded.


¹We return to this point in Section 3, where it will influence the hypotheses of the fundamental
theorem of Kolmogorov.

²In his book Measure Theory Doob writes on p. 179, “An attentive reader will observe ... that
in other chapters a function is f or g, and so on, whereas in this chapter [on probability] a function
is more likely to be x or y, and so on, at the other end of the alphabet. This difference is traditional,
and is one of the principal features that distinguishes probability from the rest of measure theory.”

There is a second way of computing expectation. When $\Omega$ is a finite set as
above, we can group all the terms in $\sum_{\omega \in \Omega} x(\omega) P(\{\omega\})$ for which $x(\omega)$ takes
a particular value $c$ and then sum on $c$. The regrouped value of the sum is
$\sum_c c\,P(\{\omega \mid x(\omega) = c\})$. The corresponding formula in the general case involves
the distribution of $x$, the Stieltjes measure $\mu_x$ on the Borel sets of the line $\mathbb{R}$
defined by³

$$\mu_x(A) = P(\{\omega \in \Omega \mid x(\omega) \in A\}).$$


This measure has total mass $\mu_x(\mathbb{R}) = P(\Omega) = 1$. The notion of $\mu_x$, but not the
name, was introduced in Section VI.10 of Basic. The formula for expectation in
terms of the distribution of $x$ is $E(x) = \int_{\mathbb{R}} t\,d\mu_x(t)$; the justification for this formula
lies in the following proposition, which was proved in Basic as Proposition 6.56a
and which we re-prove here.


Proposition 9.1. If $x : \Omega \to \mathbb{R}$ is a random variable on a probability space
$(\Omega, P)$ and if $\mu_x$ is the distribution of $x$, then

$$\int_\Omega \Phi(x(\omega))\,dP(\omega) = \int_{\mathbb{R}} \Phi(t)\,d\mu_x(t)$$

for every nonnegative Borel measurable function $\Phi : \mathbb{R} \to \mathbb{R}$. The formula
extends to the case in which the condition “nonnegative” on $\Phi$ is dropped if the
integrals for $\Phi^+ = \max(\Phi, 0)$ and $\Phi^- = -\min(\Phi, 0)$ are both finite.


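PROOF. When $\Phi$ is the indicator function $I_A$ of a Borel set $A$ of $\mathbb{R}$, the two
sides of the identity are $P(x^{-1}(A))$ and $\mu_x(A)$, and these are equal by definition
of $\mu_x$. We can pass to nonnegative simple functions by linearity and then to
general nonnegative Borel measurable functions $\Phi$ by monotone convergence.
The extension to general $\Phi$ follows by applying the nonnegative case to $\Phi^+$ and
$\Phi^-$ separately and subtracting.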


The qualitative conclusion of Proposition 9.1 is by itself important: the
expectation of any function of a random variable can be computed in terms of the
distribution of the random variable, without reference to the underlying measure
space $\Omega$.
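
For a finite probability space both sides of the identity in Proposition 9.1 are finite sums, and the equality can be checked directly. The sketch below assumes a hypothetical model (two tosses of a fair coin, with the random variable counting heads and with the function taken to be squaring); none of these choices come from the text.

```python
from fractions import Fraction

# Hypothetical finite probability space: two tosses of a fair coin.
P = {
    ("H", "H"): Fraction(1, 4),
    ("H", "T"): Fraction(1, 4),
    ("T", "H"): Fraction(1, 4),
    ("T", "T"): Fraction(1, 4),
}

def x(omega):
    """Random variable: the number of heads in the outcome omega."""
    return sum(1 for toss in omega if toss == "H")

def phi(t):
    """A nonnegative function of the value of x."""
    return t * t

def distribution(P, x):
    """Point masses of the distribution of x: value -> P(x = value)."""
    mu = {}
    for omega, weight in P.items():
        mu[x(omega)] = mu.get(x(omega), Fraction(0)) + weight
    return mu

mu_x = distribution(P, x)

# Left side: integrate phi(x(omega)) over the sample space.
lhs = sum(phi(x(omega)) * weight for omega, weight in P.items())
# Right side: integrate phi(t) against the distribution of x on the line.
rhs = sum(phi(t) * mass for t, mass in mu_x.items())

print(lhs, rhs)
```

The right-hand sum never looks at the sample space itself, only at the point masses of the distribution.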

The expression for $E(x)$ arising from Proposition 9.1 can often be written as a
“Stieltjes integral,” which is a simple generalization of the Riemann integral, and
thus the proposition in principle gives a way of computing expectations without
Lebesgue integration.⁴


³Naturally this notion of distribution is not to be confused with the kind in Chapter V.
⁴Consequently the resulting formula for expectations is handy pedagogically and is often
exploited in elementary probability books.
Although this book does not adhere to the practice, many probabilists prefer to
work with the associated monotone function for the Stieltjes measure $\mu_x$, rather
than the measure itself. They refer to this monotone function as the distribution
function of $x$, whereas Basic would call it the distribution function of $\mu_x$. When
the monotone function is absolutely continuous (for example, when it has a
continuous derivative), its derivative is called the density of the random variable
$x$. If $x$ has a density $f_x$, the formula for expectation becomes $E(x) = \int_{-\infty}^{\infty} t f_x(t)\,dt$.
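
As a quick check of the density formula, consider the random number from $[0, 2]$ discussed earlier in this section. Its distribution is normalized Lebesgue measure on $[0, 2]$, so, reading that measure as a density (an interpretation spelled out here for illustration), one has $f_x(t) = 1/2$ for $0 \leq t \leq 2$ and $f_x(t) = 0$ elsewhere, and the expectation works out as follows.

```latex
E(x) = \int_{-\infty}^{\infty} t f_x(t)\,dt
     = \int_0^2 \frac{t}{2}\,dt
     = \left[ \frac{t^2}{4} \right]_0^2
     = 1.
```

The value 1 is the midpoint of the interval, as one would expect for a number selected at random from $[0, 2]$.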

A set of random variables is said to be identically distributed if all of them 
have the same Stieltjes measure as distribution. We shall make use of identically 
distributed random variables in Section 4. 

Let us examine the formula in Proposition 9.1 more closely. The integral on
the left side is the expectation of the random variable $\Phi \circ x$, but the integral on
the right side is not the usual integral for an expectation. We therefore obtain the
identity

$$\int_{\mathbb{R}} \Phi(t)\,d\mu_x(t) = \int_{\mathbb{R}} s\,d\mu_{\Phi \circ x}(s),$$

which is a kind of change-of-variables formula for random variables.

Although Proposition 9.1 allows us to compute the expectation of any Borel 
function of a random variable in terms of the distribution of the random variable, 
it does not help us when we have to deal with more than one random variable. The 
appropriate device for more than one random variable is a “joint distribution.” If 
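$x_1, \dots, x_N$ are random variables, define, for each Borel set $A$ in $\mathbb{R}^N$,

$$\mu_{x_1,\dots,x_N}(A) = P(\{\omega \in \Omega \mid (x_1(\omega), \dots, x_N(\omega)) \in A\}).$$

Then $\mu_{x_1,\dots,x_N}$ is a Borel measure on $\mathbb{R}^N$ with $\mu_{x_1,\dots,x_N}(\mathbb{R}^N) = 1$. It is called the
joint distribution of $x_1, \dots, x_N$. Referring to the definition, we see that we can
obtain the joint distribution of a subset of $x_1, \dots, x_N$ by dropping the relevant
variables: for example, dropping $x_N$ enables us to pass from the joint distribution
of $x_1, \dots, x_N$ to the joint distribution of $x_1, \dots, x_{N-1}$, the formula being

$$\mu_{x_1,\dots,x_{N-1}}(B) = \mu_{x_1,\dots,x_N}(B \times \mathbb{R}).$$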
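
When the random variables take only finitely many values, the joint distribution is a finite table of point masses, and dropping the last variable amounts to summing over its values. The sketch below assumes a hypothetical joint distribution of two variables; the table and the helper name `drop_last` are illustrative only.

```python
from fractions import Fraction

# Hypothetical joint distribution of (x_1, x_2), stored as point masses.
mu_12 = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

def drop_last(mu):
    """Joint distribution of all but the last variable.

    The mass of a point b is the mass that mu assigns to b times the
    whole line, i.e. the sum over all values of the last coordinate.
    """
    marginal = {}
    for point, mass in mu.items():
        marginal[point[:-1]] = marginal.get(point[:-1], Fraction(0)) + mass
    return marginal

mu_1 = drop_last(mu_12)
print(mu_1)
```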


Proposition 9.2. If $x_1, \dots, x_N$ are random variables on a probability space
$(\Omega, P)$ and if $\mu_{x_1,\dots,x_N}$ is their joint distribution, then

$$\int_\Omega \Phi(x_1(\omega), \dots, x_N(\omega))\,dP(\omega) = \int_{\mathbb{R}^N} \Phi(t_1, \dots, t_N)\,d\mu_{x_1,\dots,x_N}(t_1, \dots, t_N)$$

for every nonnegative Borel measurable function $\Phi : \mathbb{R}^N \to \mathbb{R}$. The formula
extends to the case in which the condition “nonnegative” on $\Phi$ is dropped if the
integrals for $\Phi^+ = \max(\Phi, 0)$ and $\Phi^- = -\min(\Phi, 0)$ are both finite.
PROOF. When $\Phi$ is the indicator function $I_A$ of a Borel set $A$ of $\mathbb{R}^N$, the
two sides of the identity are $P((x_1, \dots, x_N)^{-1}(A))$ and $\mu_{x_1,\dots,x_N}(A)$, and these
are equal by definition of $\mu_{x_1,\dots,x_N}$. We can pass to nonnegative simple functions
by linearity and then to general nonnegative Borel measurable functions $\Phi$ by
monotone convergence.


As with Proposition 9.1, the qualitative conclusion of Proposition 9.2 is by
itself important: the expectation of any function of $N$ random variables can be
computed in terms of their joint distribution, without reference to the underlying
measure space $\Omega$. For example the product of the $N$ random variables is a function
of them, and therefore

$$E(x_1 \cdots x_N) = \int_{\mathbb{R}^N} t_1 \cdots t_N\,d\mu_{x_1,\dots,x_N}(t_1, \dots, t_N).$$


The possibility of making such computations without explicitly using $\Omega$ has the
effect of changing the emphasis in the subject. Often it is not that one is given
such-and-such probability space and such-and-such random variables on it. Instead,
one is given some random variables and, if not their precise joint distribution, at
least some properties of it. Accordingly, we can ask, What Borel measures $\mu$
on $\mathbb{R}^N$ with $\mu(\mathbb{R}^N) = 1$ are joint distributions of some family $x_1, \dots, x_N$ of $N$
random variables on some probability space $(\Omega, P)$?

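The answer is, all Borel measures $\mu$ with $\mu(\mathbb{R}^N) = 1$. In fact, we have only to
take $(\Omega, P) = (\mathbb{R}^N, \mu)$ and let $x_j$ be the $j^{\text{th}}$ coordinate function
$x_j(\omega_1, \dots, \omega_N) = \omega_j$ on $\mathbb{R}^N$. Substituting into the definition of joint distribution,
we see that the value of $\mu_{x_1,\dots,x_N}$ on any Borel set $A$ is

$$\mu_{x_1,\dots,x_N}(A) = \mu(\{\omega \in \mathbb{R}^N \mid (x_1(\omega), \dots, x_N(\omega)) \in A\}) = \mu(\{\omega \in \mathbb{R}^N \mid (\omega_1, \dots, \omega_N) \in A\}) = \mu(A).$$

Thus $\mu_{x_1,\dots,x_N}$ equals the given measure $\mu$.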


2. Independent Random Variables 


The notion of independence of events in probability theory is a matter of definition, 
but the definition tries to capture the intuition that one might attach to the term. 
Thus one seeks a mathematical condition saying that a set of attributes determining 
a first event has no influence on a second event and vice versa. Kolmogorov 
writes,⁵


⁵In his Foundations of the Theory of Probability, second English edition, pp. 8-9.
Historically, the independence of experiments and random variables
represents the very mathematical concept that has given the theory 
of probability its peculiar stamp. The classical work of LaPlace, 
Poisson, Tchebychev, Liapounov, Mises, and Bernstein is actually 
dedicated to the fundamental investigation of series of independent 
random variables. ... We thus see, in the concept of independence, at 
least the germ of the peculiar type of problem in probability theory... . 
In consequence, one of the most important problems in the philosophy 
of the natural sciences is — in addition to the well-known one regarding 
the essence of the concept of probability itself—to make precise the 
premises which would make it possible to regard any given real events 
as independent. 


The path to discovering the mathematical condition that captures independence 
of events begins with “conditional probability.” Let A and B be two events, and 
assume that P(B) > 0. Think of A as a variable. The conditional probability of 
$A$ given $B$, written $P(A \mid B)$, is to be a new probability measure, as $A$ varies, and
is to be a version of $P$ adjusted to take into account that $B$ happens. These words
are interpreted to mean that a normalization is called for, and the corresponding
definition is therefore

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

In measure-theoretic terms, we pass from the measure space $(\Omega, \mathcal{A}, P)$ to
the measure space $(B, \mathcal{A} \cap B, P((\,\cdot\,) \cap B)/P(B))$. Conditional probabilities
$P(A \mid B)$ are left undefined when $P(B) = 0$.
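
The normalization in the definition can be seen on a finite example. The following is a minimal sketch, again assuming one roll of a fair die as a hypothetical model; the function name `conditional` is illustrative. It checks that conditioning on B produces a measure of total mass 1 that is still additive.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # one roll of a fair die (illustrative choice)

def prob(event):
    """P(event) for the uniform probability measure on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def conditional(event, given):
    """P(event | given) = P(event and given) / P(given).

    Left undefined (an exception is raised) when P(given) = 0.
    """
    if prob(given) == 0:
        raise ValueError("P(A | B) is undefined when P(B) = 0")
    return prob(event & given) / prob(given)

A = frozenset({2, 4, 6})  # "the roll is even"
B = frozenset({4, 5, 6})  # "the roll is at least 4"

p_A_given_B = conditional(A, B)
total_mass = conditional(OMEGA, B)  # the adjusted measure has total mass 1
print(p_A_given_B, total_mass)
```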

The intuition concerning independence of $A$ and $B$ is that the occurrence of $B$
is not to influence the probability of $A$. Thus two events $A$ and $B$ are to be
independent, at least when $P(B) > 0$, if $P(A) = P(A \mid B)$. This condition initially
looks asymmetric, but if we substitute the definition of conditional probability,
we find that the condition is $P(A) = \frac{P(A \cap B)}{P(B)}$, hence that

$$P(A \cap B) = P(A)P(B).$$

This condition is symmetric, and it allows us to drop the assumption that
$P(B) > 0$. We therefore define the events $A$ and $B$ to be independent if
$P(A \cap B) = P(A)P(B)$.
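
The product condition is easy to test on a finite probability space. The sketch below assumes a fair die as a hypothetical model and illustrative event names. It shows one independent pair, one dependent pair, and the fact that an event of probability 0 satisfies the condition with no special treatment.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # one roll of a fair die (illustrative choice)

def prob(event):
    """P(event) for the uniform probability measure on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def independent(a, b):
    """Symmetric product condition P(A and B) = P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)

EVEN = frozenset({2, 4, 6})
AT_MOST_4 = frozenset({1, 2, 3, 4})
AT_LEAST_4 = frozenset({4, 5, 6})
EMPTY = frozenset()

print(independent(EVEN, AT_MOST_4))   # 1/3 on both sides
print(independent(EVEN, AT_LEAST_4))  # 1/3 versus 1/4
print(independent(EVEN, EMPTY))       # 0 on both sides
```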

As the quotation above from Kolmogorov indicates, the question whether this 
definition of independence captures from nature our intuition for what the term 
should mean is a deep fundamental problem in the philosophy of science. We 
shall not address it further. 

But a word of caution is appropriate. The assumption of mathematical inde- 
pendence carries with it far-reaching consequences, and it is not to be treated 
lightly. Members of the public all too frequently assume independence without 
sufficient evidence for it. Here are two examples that made national news in
recent years. 


EXAMPLES. 


(1) In the murder trial of a certain sports celebrity, a criminalist presented 
evidence that three characteristics of some of the blood at the scene matched the 
defendant’s blood, and the question was to quantify the likelihood of this match 
if the defendant was not the murderer. Two of the three characteristics amounted 
to the usual blood type and Rh factor, and the criminalist said that half the people 
in the population had blood with these characteristics. The third characteristic 
was something more unusual, and he asserted that only 4% of the population had 
blood with this characteristic. He concluded that only 2% of the population had 
blood for which these three characteristics matched those in the defendant’s blood 
and the blood at the scene. The defense attorney jumped on the criminalist, asking 
how he arrived at the 2% figure, and received a confirmation that the criminalist 
had simply multiplied the probability .5 for the blood type and Rh factor by the 
.04 for the third characteristic. Upon being questioned further, the criminalist 
acknowledged that he had multiplied the probabilities because he could not see 
that these characteristics had anything to do with each other. The defense attorney 
elicited a further acknowledgement that the criminalist was aware of no studies 
of the joint distribution. The criminalist’s testimony was thus discredited, and the 
jurors could ignore it. What the criminalist could have said, but did not, was that 
anyway at most 4% of the population had blood with those three characteristics 
because of that third characteristic alone; that assertion would not have required 
any independence. 


(2) In the 2004 presidential election, some malfunctions involving electronic 
voting machines occurred in three states in a particular way that seemed to favor 
one of the two main candidates. One national commentator who pursued this story 
rounded up an expert who examined closely what happened in one of the states 
and came up with a rather small probability of about .1 for the malfunction to have 
been a matter of pure chance. Seeing that the three states were widely separated 
geographically and that communication between officials of the different states 
on Election Day was unlikely, the commentator apparently concluded in his mind 
that the three events were independent. So he multiplied the probabilities and 
announced to the public that the probability of this malfunction in all three states 
on the basis of pure chance was a decisively small .001. What he ignored was
that the machines in the three states were all made by the same company; so the 
assumption of independence was doubtful. 


Of more importance for our purposes than independence of events is the notion 
of independence of random variables. Tentatively let us say that two random 
variables $x$ and $y$ on a probability space $(\Omega, P)$ are defined to be independent
if $\{x(\omega) \in A\}$ and $\{y(\omega) \in B\}$ are independent events for every pair of Borel
subsets $A$ and $B$ of $\mathbb{R}$. Substituting the definition of independent events, we see
that the condition is that

$$P(\{\omega \mid (x(\omega), y(\omega)) \in A \times B\}) = P(\{\omega \mid x(\omega) \in A\})\,P(\{\omega \mid y(\omega) \in B\})$$

for every pair of Borel subsets of $\mathbb{R}$. We can rewrite this condition in terms of
distributions as

$$\mu_{x,y}(A \times B) = \mu_x(A)\,\mu_y(B).$$

In other words, the measure $\mu_{x,y}$ on $\mathbb{R}^2$ agrees with the product measure $\mu_x \times \mu_y$
on measurable rectangles. The two measures must then agree on all Borel sets
of $\mathbb{R}^2$. Conversely if the two measures agree on all Borel sets of $\mathbb{R}^2$, then they
agree on all measurable rectangles. We therefore adopt the following definition:
two random variables $x$ and $y$ on a probability space $(\Omega, P)$ are independent
if their joint distribution is the product of their individual distributions, i.e., if
$\mu_{x,y} = \mu_x \times \mu_y$.
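
For random variables with finitely many values, the definition can be checked by comparing point masses, since equality of the joint distribution and the product on single points gives equality on all rectangles by additivity. The sketch below assumes a hypothetical model of two tosses of a fair coin encoded as 0 and 1; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical probability space: two tosses of a fair coin, coded 0/1.
P = {omega: Fraction(1, 4) for omega in product((0, 1), repeat=2)}

def joint_distribution(P, *variables):
    """Point masses of the joint distribution of the given random variables."""
    mu = {}
    for omega, weight in P.items():
        key = tuple(v(omega) for v in variables)
        mu[key] = mu.get(key, Fraction(0)) + weight
    return mu

def are_independent(P, x, y):
    """Compare the joint distribution of (x, y) with the product of marginals."""
    mu_xy = joint_distribution(P, x, y)
    mu_x = joint_distribution(P, x)
    mu_y = joint_distribution(P, y)
    return all(
        mu_xy.get((s, t), Fraction(0)) == mu_x[(s,)] * mu_y[(t,)]
        for (s,) in mu_x
        for (t,) in mu_y
    )

first = lambda omega: omega[0]              # result of the first toss
second = lambda omega: omega[1]             # result of the second toss
total = lambda omega: omega[0] + omega[1]   # number of heads

print(are_independent(P, first, second))  # coordinates of a product measure
print(are_independent(P, first, total))   # the total depends on the first toss
```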

One can go through a similar analysis, starting from conditional probability
involving $N$ events, and be led to a similar result for $N$ random variables. The
upshot is that $N$ random variables $x_1, \dots, x_N$ on a probability space $(\Omega, P)$
are defined to be independent if their joint distribution $\mu_{x_1,\dots,x_N}$ is the $N$-fold
product of the individual distributions $\mu_{x_1}, \dots, \mu_{x_N}$. An infinite collection of
random variables is said to be independent if every finite subcollection of them
is independent.

We can ask whether arbitrarily large finite numbers of independent random 
variables exist on some probability space with specified distributions, and the 
answer is “yes.” This question is a special case of the one at the end of Section 1. 
If we are given $N$ Borel measures $\mu_1, \dots, \mu_N$ on $\mathbb{R}$ and we seek independent 
random variables with these measures as their respective individual distributions, 
we form the product measure $\mu = \mu_1 \times \cdots \times \mu_N$. Then the observation at the 
end of Section 1 shows us that if we take $(\mathbb{R}^N, \mu)$ as a probability space and if 
we define $N$ random variables on $\mathbb{R}^N$ to be the $N$ coordinate functions, then the 
random variables have $\mu$ as joint distribution. Since $\mu$ is a product, the random 
variables are independent. 

The question is more subtle if asked about infinitely many independent random 
variables. If, for example, we are given an infinite sequence of Borel measures 
on $\mathbb{R}$, we do not yet have tools for obtaining a probability space with a sequence 
of independent random variables having those individual distributions.6 We can 
handle an arbitrarily large finite number, and we need a way to pass to the limit. 
The passage to the limit for this situation is the simplest nontrivial application of 
the fundamental theorem of Kolmogorov that was mentioned in Section 1. The 
theorem will be stated and proved in Section 3. 


6There is one trivial case that we can already handle. An arbitrary set of constant random 
variables can always be adjoined to an independent set, and the independence will persist for the 
enlarged set. 

We conclude this section with two propositions about independence. 


Proposition 9.3. If $x_1, \dots, x_N$ are independent random variables on a probability 
space, then $E(x_1 \cdots x_N) = E(x_1) \cdots E(x_N)$. 


PROOF. If $\mu_{x_1,\dots,x_N}$ is the joint distribution of $x_1, \dots, x_N$, then it was observed 
after Proposition 9.2 that 

$$E(x_1 \cdots x_N) = \int_{\mathbb{R}^N} t_1 \cdots t_N \, d\mu_{x_1,\dots,x_N}(t_1, \dots, t_N). \tag{*}$$

The independence means that $d\mu_{x_1,\dots,x_N}(t_1, \dots, t_N) = d\mu_{x_1}(t_1) \cdots d\mu_{x_N}(t_N)$. 
Then the integral on the right side of (*) splits as the product of $N$ integrals, the 
$j$th factor being $\int_{\mathbb{R}} t_j \, d\mu_{x_j}(t_j)$. This $j$th factor equals $E(x_j)$, and the proposition 
follows. 
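
The splitting of the integral in (*) can be seen concretely for finitely supported distributions. The sketch below is an illustration under that assumption, with hypothetical names (`expectation`, `expectation_of_product`) and three made-up distributions: it sums $t_1 t_2 t_3$ against the product measure on $\mathbb{R}^3$ and compares the result with the product of the three individual expectations, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product
from math import prod


def expectation(mu):
    """E(x) for a finitely supported distribution {value: mass}."""
    return sum(t * p for t, p in mu.items())


def expectation_of_product(mus):
    """Integrate t_1 * ... * t_N against the product measure mu_1 x ... x mu_N."""
    total = Fraction(0)
    for atoms in product(*(mu.items() for mu in mus)):
        value = prod(t for t, _ in atoms)    # the integrand t_1 ... t_N
        weight = prod(p for _, p in atoms)   # the product-measure mass of the atom
        total += value * weight
    return total


die = {k: Fraction(1, 6) for k in range(1, 7)}
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
skewed = {-1: Fraction(1, 4), 2: Fraction(3, 4)}
mus = [die, coin, skewed]

lhs = expectation_of_product(mus)            # E(x_1 x_2 x_3) under independence
rhs = prod(expectation(mu) for mu in mus)    # E(x_1) E(x_2) E(x_3)
print(lhs, rhs)
```

Here the individual expectations are 7/2, 1/2, and 5/4, so both sides equal 35/16.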


Proposition 9.4. Let 

$$x_1, \dots, x_{k_1},\ x_{k_1+1}, \dots, x_{k_2},\ x_{k_2+1}, \dots, x_{k_3},\ \dots,\ x_{k_{m-1}+1}, \dots, x_{k_m}$$

be $k_m$ independent random variables on a probability space, define $k_0 = 0$, and 
suppose that $F_j : \mathbb{R}^{k_j - k_{j-1}} \to \mathbb{R}$ is a Borel function for each $j$ with $1 \le j \le m$. 
Then the $m$ random variables $F_j(x_{k_{j-1}+1}, \dots, x_{k_j})$ are independent. 


REMARKS. That is, functions of disjoint subsets of a set of independent random 
variables are independent. 


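PROOF. Put $y_j = (x_{k_{j-1}+1}, \dots, x_{k_j})$, and define $y = (y_1, \dots, y_m)$ and 
$F = (F_1, \dots, F_m)$. Let $\mathbb{R}_j$ be the copy of $\mathbb{R}^{k_j - k_{j-1}}$ corresponding to variables numbered 
$k_{j-1} + 1$ through $k_j$, and regard the distribution $\mu_{F_j(y_j)}$ of $F_j$ as a measure on $\mathbb{R}_j$. 
What needs proof is that 

$$\mu_{F(y)} = \mu_{F_1(y_1)} \times \cdots \times \mu_{F_m(y_m)}. \tag{*}$$

Both sides of this expression are Borel measures on $\mathbb{R}^m$. On any product set 
$A = A_1 \times \cdots \times A_m$, where $A_j$ is a Borel subset of $\mathbb{R}_j$, we have 

$$\begin{aligned}
\mu_{F(y)}(A) &= P(\{\omega \mid F(y(\omega)) \in A\}) \\
&= P(\{\omega \mid F_j(y_j(\omega)) \in A_j \text{ for all } j\}) \\
&= P(\{\omega \mid y_j(\omega) \in F_j^{-1}(A_j) \text{ for all } j\}) \\
&= \prod_{j=1}^m P(\{\omega \mid y_j(\omega) \in F_j^{-1}(A_j)\}) \qquad \text{by the assumed independence} \\
&= \prod_{j=1}^m P(\{\omega \mid F_j(y_j)(\omega) \in A_j\}) \\
&= \prod_{j=1}^m \mu_{F_j(y_j)}(A_j).
\end{aligned}$$

Consequently the two sides of (*) are equal on all Borel sets. 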


EXAMPLES. 

(1) If $x_1, x_2, \dots, x_N$ are independent random variables and $F_1, F_2, \dots, F_N$ are 
Borel functions on $\mathbb{R}^1$, then $F_1(x_1), F_2(x_2), \dots, F_N(x_N)$ are independent random 
variables. 

(2) If $x_1, \dots, x_N$ are independent random variables and if $s_j = x_1 + \cdots + x_j$, 
then the two random variables $s_j$ and $s_N - s_j$ are independent because $s_j$ depends 
only on $x_1, \dots, x_j$ and $s_N - s_j$ depends only on $x_{j+1}, \dots, x_N$. 


3. Kolmogorov Extension Theorem 


The problem addressed by the Kolmogorov theorem is the setting up of a “stochastic 
process,” a notion that will be defined presently. Many stochastic processes have 
a time variable in them, which can be discrete or continuous. The process has a 
set $S$ of “states,” which can be a finite set, a countably infinite set, or a suitably 
nice uncountable set. It will be sufficient generality for our purposes that the set 
of states be realizable as a subset of a Euclidean space, the measurable subsets of 
states being the intersection of S with the Borel sets of the Euclidean space. The 
defining measurable functions tell the state at each instant of time. Accordingly, 
one might want to enlarge the definition of random variable to allow the range to 
contain S. But we shall not do so, instead referring to “measurable functions” in 
the appropriate places rather than random variables. 

Let us give one example of a stochastic process with discrete time and another 
with continuous time, with particular attention to the passage to the limit that is 
needed in order to have a probability model realizing the stochastic process. 

In the example with discrete time, we shall assume also that the state space $S$ is 
countable. The probabilistic interpretation of the situation visualizes the process 
as moving from state to state as time advances through the positive integers, 
with probabilities depending on the complete past history but not the future; but 
this interpretation will not be important for us. Let us consider the analysis. 
In the $n$th finite approximation $(\Omega_n, \mathcal{A}_n, P_n)$ for $n \ge 1$, the set $\Omega_n$ is countable 
and consists of all ordered $n$-tuples of members of $S$, while $\mathcal{A}_n$ is the set of 
all subsets of $\Omega_n$. The measure $P_n$ is determined by assigning a nonnegative 
weight to each member of $\Omega_n$, the sum of all the weights being 1. As $n$ varies, 
a consistency condition is to be satisfied: the sum over $S$ of all the weights in 
$\Omega_{n+1}$ of the $(n+1)$-tuples that start with a particular $n$-tuple is the weight in $\Omega_n$ 
attached to that $n$-tuple. The distinguished measurable functions7 that tell the 
result of an experiment are the $n$ coordinate functions that associate to an $n$-tuple 
$\omega$ its various entries. What is wanted is a single measure space $(\Omega, \mathcal{A}, P)$ that 
incorporates all these approximations. It is fairly clear that $\Omega$ should be the set 
of all infinite sequences of members of $S$ and that the distinguished measurable 
functions are to be the infinite set of coordinate functions. Defining $\mathcal{A}$ and $P$ is 
a little harder. Each $n$-tuple $\omega^{(n)}$ forms a singleton set in $\mathcal{A}_n$, and we associate 
to $\omega^{(n)}$ the set $T_n(\omega^{(n)})$ of all members of $\Omega$ whose initial segment of length $n$ is 
$\omega^{(n)}$. The members of $\mathcal{A}_n$ are unions of these singleton sets, and we associate to 
any member $X$ of $\mathcal{A}_n$ the union $T_n(X)$ of the sets $T_n(\omega^{(n)})$ for $\omega^{(n)}$ in $X$. Also, we 
define $P(T_n(X)) = P_n(X)$. In this way we identify $\mathcal{A}_n$ with a $\sigma$-algebra $T_n(\mathcal{A}_n)$ 
of subsets of $\Omega$, and we attach a value of $P$ to each member of $T_n(\mathcal{A}_n)$. Define 

$$\mathcal{A}' = \bigcup_{n=1}^{\infty} T_n(\mathcal{A}_n).$$


7The measurable functions are random variables in this case since $S \subseteq \mathbb{R}$. 


The $\sigma$-algebras $T_n(\mathcal{A}_n)$ increase with $n$, and it follows that the union of two 
members of $\mathcal{A}'$ is in $\mathcal{A}'$ and that the complement of a member of $\mathcal{A}'$ is in $\mathcal{A}'$; 
hence $\mathcal{A}'$ is an algebra, and $\mathcal{A}$ can be taken as the smallest $\sigma$-algebra containing 
$\mathcal{A}'$. In the union defining $\mathcal{A}'$, a set can arise from more than one term. For example, 
if a set $X$ in $\mathcal{A}_n$ is given and a set $Y$ in $\mathcal{A}_{n+1}$ consists of all $(n+1)$-tuples whose 
initial $n$-tuple lies in $X$, then $T_n(X) = T_{n+1}(Y)$. The above consistency condition 
implies that $P_n(X) = P_{n+1}(Y)$, and hence the two definitions of $P$ on the set 
$T_n(X) = T_{n+1}(Y)$ are consistent. The result is that $P$ is well defined on $\mathcal{A}'$. Since 
the $T_n(\mathcal{A}_n)$ increase with $n$ and since the restriction of $P$ to each one is additive, 
it follows that $P$ is additive. However, it is not apparent whether $P$ is completely 
additive since the members of a countable disjoint sequence of sets in $\mathcal{A}'$ might not 
lie in a single $T_n(\mathcal{A}_n)$. This is the matter addressed by the Kolmogorov theorem. 
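
The consistency condition on the weights can be tested directly in a small case. The sketch below assumes, purely for illustration, a two-element state space and weights built from a hypothetical initial distribution `initial` and one-step transition table `step` (so the weights depend only on the previous state, a special case of dependence on the past history). It checks that summing the weights of the $(n+1)$-tuples extending an $n$-tuple gives back the weight of that $n$-tuple, and that $P_n(X) = P_{n+1}(Y)$ when $Y$ consists of all extensions of members of $X$.

```python
from fractions import Fraction
from itertools import product

S = ("a", "b")  # hypothetical state space
initial = {"a": Fraction(1, 3), "b": Fraction(2, 3)}
step = {"a": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
        "b": {"a": Fraction(1, 4), "b": Fraction(3, 4)}}


def weight(w):
    """Weight that P_n assigns to the n-tuple w, where n = len(w)."""
    p = initial[w[0]]
    for prev, cur in zip(w, w[1:]):
        p *= step[prev][cur]
    return p


def measure(X):
    """P_n(X) for a set X of n-tuples: the sum of the weights of its members."""
    return sum(weight(w) for w in X)


def consistent(n):
    """Sum over S of the weights of the extensions of w equals the weight of w."""
    return all(sum(weight(w + (s,)) for s in S) == weight(w)
               for w in product(S, repeat=n))


X = {("a", "b"), ("b", "b")}            # a member of A_2
Y = {w + (s,) for w in X for s in S}    # all 3-tuples whose initial 2-tuple is in X

print([consistent(n) for n in (1, 2, 3)])
print(measure(X), measure(Y))
```

The agreement of `measure(X)` and `measure(Y)` is what makes $P$ well defined on the set $T_2(X) = T_3(Y)$; the check says nothing about complete additivity, which is the part that needs the theorem.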

For purposes of being able to have a general theorem, let us make an observation. 
Although the consistency condition used in the above example appears to 
rely on the ordering of the time variable, that ordering really plays no role in the 
above construction. We could as well have defined an $F$th finite approximation for 
each finite subset $F$ of the positive integers; the above consistency condition used 
in passing from $F = \{1, \dots, n\}$ to $F' = \{1, \dots, n, n+1\}$ implies a consistency for 
general finite sets of indices with $F \subseteq F'$: the result of summing the weights of all 
members of $\Omega_{F'}$ whose restriction to the coordinates indexed by $F$ is a particular 
member of $\Omega_F$ yields the weight of the member of $\Omega_F$. This observation makes it 
possible to formulate the Kolmogorov theorem in a way that allows for continuous 
time. 

Let us then come to the example with continuous time. The example is a model 
of Brownian motion, which was discovered as a physical phenomenon in 1826. 
Microscopic particles, when left alone in a liquid, can be seen to move along 
erratic paths; this movement results from collisions between such a particle and 
molecules of the liquid. An experiment can consist of a record of the position 
in $\mathbb{R}^3$ of a particle as a function of time. When the data are studied and suitably 
extrapolated to the situation that the liquid is all of $\mathbb{R}^3$, one finds an explicit 
formula usable to define the probability that the moving particle lies in given 
subsets of $\mathbb{R}^3$ at a given finite set of times. Namely, for $t > 0$, define 

$$p_t(x, dy) = \frac{1}{(4\pi t)^{3/2}}\, e^{-|x-y|^2/(4t)}\, dy.$$

If $0 = t_0 < t_1 < t_2 < \cdots < t_n$, if $A_0, \dots, A_n$ are Borel sets in $\mathbb{R}^3$, and if the starting 
distribution of the particle at time 0 is a measure $\mu$ on $\mathbb{R}^3$, then the probability 
that the particle is in $A_0$ at time 0, $A_1$ at time $t_1$, ..., $A_{n-1}$ at time $t_{n-1}$, and $A_n$ 
at time $t_n$ is to be taken as 

$$\int_{x_0 \in A_0} \int_{x_1 \in A_1} \cdots \int_{x_{n-1} \in A_{n-1}} \int_{x_n \in A_n} p_{\Delta t_n}(x_{n-1}, dx_n)\, p_{\Delta t_{n-1}}(x_{n-2}, dx_{n-1}) \times \cdots \times p_{\Delta t_1}(x_0, dx_1)\, d\mu(x_0),$$


where $\Delta t_j = t_j - t_{j-1}$ for $1 \le j \le n$. Let $F$ be $\{0, t_1, \dots, t_n\}$. A model describing 
Brownian motion at the times of $F$ takes $\Omega$ to be the set of functions from $F$ 
into $\mathbb{R}^3$, i.e., a copy of $(\mathbb{R}^3)^{n+1}$, and the measurable sets are the Borel sets. The 
distinguished measurable functions are again coordinate functions;8 they pick off 
the values in $\mathbb{R}^3$ at each of the times in $F$. Finally the measure $P_F$ takes the value 
given by the above formula on the product set $A_0 \times \cdots \times A_n$, and it is evident 
that $P_F$ extends uniquely to a Borel measure on $(\mathbb{R}^3)^{n+1}$, the value of $P_F(A)$ for 
$A \subseteq (\mathbb{R}^3)^{n+1}$ being the integral over $A$ of the integrand in the display above. If $F'$ 
is the union of $F$ and one additional time, then $P_F$ and $P_{F'}$ satisfy a consistency 
property saying that if $x_j$ is integrated over all of $\mathbb{R}^3$, then the integral can be 
computed and the result is the same as if index $j$ were completely dropped in the 
formula; this comes down to the identity 

$$\int_{y \in \mathbb{R}^3} \frac{1}{(4\pi s)^{3/2}}\, e^{-|y-z|^2/(4s)}\, \frac{1}{(4\pi t)^{3/2}}\, e^{-|x-y|^2/(4t)}\, dy = \frac{1}{(4\pi (s+t))^{3/2}}\, e^{-|x-z|^2/(4(s+t))},$$

which follows from the formula $\int_{-\infty}^{\infty} e^{-\pi x^2}\, dx = 1$, Fubini’s Theorem, and 
some elementary changes of variables. The passage to the limit that needs to 
be addressed is how to get a model that incorporates all $t \ge 0$ at once. The space 
can be $(\mathbb{R}^3)^{[0,+\infty)}$. An algebra $\mathcal{A}'$ can be built from the $\sigma$-algebras of Borel sets 
of the Euclidean spaces $(\mathbb{R}^3)^F$, and an additive set function $P$ can be consistently 
defined on $\mathcal{A}'$ so that one recovers $P_F$ on each space $(\mathbb{R}^3)^F$. What needs to be 
addressed is the complete additivity of $P$. 


8Since their values are not in $\mathbb{R}$, these measurable functions are not, strictly speaking, random 
variables as we have defined them in Section 1. 
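
The Gaussian identity behind the consistency property for Brownian motion can be checked numerically. The kernel on $\mathbb{R}^3$ is a product of three one-dimensional kernels, so the three-dimensional identity is the cube of its one-dimensional analog; the sketch below checks only that analog, by a trapezoid sum over a truncated interval. The function names, the truncation `half_width`, the step count, and the sample values of $s$, $t$, $x$, $z$ are all assumptions made for illustration.

```python
import math


def heat_kernel_1d(t, u):
    """One-dimensional factor (4 pi t)^(-1/2) exp(-u^2 / (4t)) of the kernel p_t."""
    return math.exp(-u * u / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def convolved(s, t, x, z, half_width=30.0, steps=24_000):
    """Trapezoid approximation of the integral over y of p_s(y - z) p_t(x - y)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += w * heat_kernel_1d(s, y - z) * heat_kernel_1d(t, x - y)
    return total * h


s, t, x, z = 0.7, 1.3, 0.4, -1.1
left = convolved(s, t, x, z)
right = heat_kernel_1d(s + t, x - z)
print(left, right)
```

Agreement of the two printed numbers to many decimal places is consistent with the identity but is of course not a proof; the proof is the change-of-variables argument indicated in the text.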

A stochastic process is nothing more than a family $\{x_i \mid i \in I\}$ of measurable 
functions defined on a measure space $(\Omega, \mathcal{A}, P)$ with $P(\Omega) = 1$. The index 
set $I$ is assumed nonempty, but no other assumptions are made about it. The 
measurable functions have values in a more general space $S$ than $\mathbb{R}$, but we 
shall assume for simplicity that $S$ is contained in a Euclidean space $\mathbb{R}^N$, and 
then we may take $S$ equal to $\mathbb{R}^N$. Although stochastic processes generally are 
interesting only when the measurable functions are related to each other in some 
special way, the Kolmogorov theorem does not make use of any such special 
relationship. It addresses the construction of a general stochastic process out of 
the approximations to it that are formed from finite subsets of $I$. 

The situation is then as follows. Let $I$ be an arbitrary nonempty index set, let 
the state space $S$ be $\mathbb{R}^N$ for some fixed integer $N$, and let $\Omega = S^I$ be the set of 
functions from $I$ to $S$. We let $x_i$, for $i \in I$, be the coordinate function from $\Omega$ 
to $S$ defined by $x_i(\omega) = \omega(i)$. For $J \subseteq I$, we let $x_J = \{x_i \mid i \in J\}$; this is a 
function carrying $\Omega$ to $S^J$. 

For each nonempty finite subset $F$ of $I$, the image of $x_F$ is the Euclidean space 
$S^F$, in which the notion of a Borel set is well defined. A subset $A$ of $\Omega$ will be 
said to be measurable of type $F$ if $A$ can be described by 

$$A = x_F^{-1}(X) = \{\omega \in \Omega \mid x_F(\omega) \in X\} \qquad \text{for some Borel set } X \subseteq S^F.$$

The collection of subsets of $\Omega$ that are measurable of type $F$ is a $\sigma$-algebra that 
we denote by $\mathcal{A}_F$. If $F$ and $F'$ are finite subsets of $I$ with $F \subseteq F'$ and if the Borel 
set $X$ of $S^F$ exhibits $A$ as measurable of type $F$, then the Borel subset $X \times S^{F'-F}$ 
of $S^{F'}$ exhibits $A$ as measurable of type $F'$. Consequently $\mathcal{A}_F \subseteq \mathcal{A}_{F'}$. 

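Let $\mathcal{A}'$ be the union of the $\mathcal{A}_F$ for all finite $F$. If $F$ and $G$ are finite subsets of 
$I$, then we have $\mathcal{A}_F \subseteq \mathcal{A}_{F \cup G}$ and $\mathcal{A}_G \subseteq \mathcal{A}_{F \cup G}$, and it follows that $\mathcal{A}'$ is closed 
under finite unions and complements. Hence $\mathcal{A}'$ is an algebra of subsets of $\Omega$. 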

In effect the Kolmogorov theorem will assume that we have a consistent system 
of stochastic processes for all finite subsets of $I$. In other words, for each finite 
subset $F$ of $I$, we assume that we have a measure space $(S^F, \mathcal{B}_F, P_F)$ with $\mathcal{B}_F$ 
as the Borel sets of the Euclidean space $S^F$, with $P_F(S^F) = 1$, and with the 
distinguished measurable functions taken as the $x_i$ for $i$ in $F$. The measures $P_F$ 
are to satisfy a consistency condition as follows. To each $X$ in $\mathcal{B}_F$, we define 
a subset $A_X$ of $\Omega$ by $A_X = x_F^{-1}(X)$; this subset of $\Omega$ is measurable of type $F$, 
and we transfer the measure from $\mathcal{B}_F$ to $\mathcal{A}_F$ by defining $P_F(A_X) = P_F(X)$. 
The consistency condition is that there is a well-defined nonnegative additive set 
function $P$ on $\mathcal{A}'$ whose restriction to each $\mathcal{A}_F$ is $P_F$. The content of the theorem 
is that we obtain a stochastic process for $I$ itself. 

Theorem 9.5 (Kolmogorov Extension Theorem). Let $I$ be a nonempty index 
set, let $S = \mathbb{R}^N$, and let $\Omega = S^I$ be the set of functions from $I$ to $S$. For each 
nonempty finite subset $F$ of $I$, let $\mathcal{A}_F$ be the $\sigma$-algebra of subsets of $\Omega$ that are 
measurable of type $F$, and let $\mathcal{A}'$ be the algebra of sets given by the union of 
the $\mathcal{A}_F$ for all finite $F$. If $P$ is a nonnegative additive set function defined on $\mathcal{A}'$ 
such that $P(\Omega) = 1$ and $P|_{\mathcal{A}_F}$ is completely additive for every finite $F$, then $P$ 
is completely additive on $\mathcal{A}'$ and therefore extends to a measure on the smallest 
$\sigma$-algebra containing $\mathcal{A}'$. 


PROOF. Once we have proved that $P$ is completely additive on $\mathcal{A}'$, $P$ extends 
to a measure on the smallest $\sigma$-algebra containing $\mathcal{A}'$ as a consequence of the 
Extension Theorem.9 Let $A_n$ be a decreasing sequence of sets in $\mathcal{A}'$ with $P(A_n) \ge 
\varepsilon > 0$ for some positive $\varepsilon$. It is enough to prove that $\bigcap_{n=1}^{\infty} A_n$ is not empty. 

Each member of $\mathcal{A}'$ is measurable of type $F$ for some finite $F$, and we suppose 
that $A_n$ is measurable of type $F_n$. There is no loss of generality in assuming that 
$F_1 \subseteq F_2 \subseteq \cdots$ since a set that is measurable of type $F$ is measurable of type $F'$ 
for any $F'$ containing $F$. Let $x_i$, for $i \in I$, be the $i$th coordinate function on $\Omega$, 
and let $x_F = \{x_i \mid i \in F\}$ for each finite subset $F$ of $I$. Just as in the definition of 
joint distribution, we define a Borel measure $\mu_F$ on the Euclidean space $S^F$ by 
$\mu_F(X) = P(x_F^{-1}(X))$. This is a measure since $P|_{\mathcal{A}_F}$ is assumed to be completely 
additive. 

By definition of “measurable of type $F_n$,” the set $A_n$ is of the form 

$$A_n = \{\omega \in \Omega \mid x_{F_n}(\omega) \in X_n\}$$

for some Borel subset $X_n$ of the Euclidean space $S^{F_n}$. Since $P(A_n) \ge \varepsilon$, the 
definition of $\mu_{F_n}$ makes $\mu_{F_n}(X_n) \ge \varepsilon$. Since $S^{F_n}$ is a Euclidean space, the 
measure $\mu_{F_n}$ is regular. Therefore there exists a compact subset $K_n$ of $X_n$ with 
$\mu_{F_n}(X_n - K_n) \le 3^{-n}\varepsilon$. Putting 

$$B_n = \{\omega \in \Omega \mid x_{F_n}(\omega) \in K_n\},$$

we see that $P(A_n - B_n) \le 3^{-n}\varepsilon$. Let 

$$C_n = \bigcap_{j=1}^{n} B_j.$$

Each $C_n$ is a subset of $A_n$, and the sets $C_n$ are decreasing. We shall prove that 

$$P(C_n) \ge \varepsilon/2. \tag{*}$$


9Theorem 5.5 of Basic. 




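The proof of (*) will involve an induction: we show inductively for each $k$ 
that $B_k = D_k \cup C_k$ with $P(D_k) \le \sum_{j=1}^{k-1} 3^{-j}\varepsilon$ and $P(C_k) \ge \bigl(1 - \sum_{j=1}^{k} 3^{-j}\bigr)\varepsilon$. 
Since $1 - \sum_{j=1}^{k} 3^{-j} \ge 1 - \sum_{j=1}^{\infty} 3^{-j} = 1 - \tfrac{1}{2} = \tfrac{1}{2}$, this induction will prove 
(*). The base case of the induction is $k = 1$. In this case we have $C_1 = B_1$. If 
we take $D_1 = \varnothing$, then we have $B_1 = D_1 \cup C_1$ and $P(D_1) \le 0$ trivially, and we 
have $P(C_1) \ge (1 - \tfrac{1}{3})\varepsilon$ by construction of $B_1$. The inductive hypothesis is that 
$B_k = D_k \cup C_k$ with $P(D_k) \le \sum_{j=1}^{k-1} 3^{-j}\varepsilon$ and $P(C_k) \ge \bigl(1 - \sum_{j=1}^{k} 3^{-j}\bigr)\varepsilon$. We 
know that $A_k = (A_k - B_k) \cup B_k$. Since $B_{k+1} \subseteq A_{k+1} \subseteq A_k$, we can intersect 
$B_{k+1}$ with this equation and then use the inductive hypothesis to obtain 

$$\begin{aligned}
B_{k+1} &= (B_{k+1} \cap (A_k - B_k)) \cup (B_{k+1} \cap B_k) \\
&= (B_{k+1} \cap (A_k - B_k)) \cup (B_{k+1} \cap (D_k \cup C_k)) \\
&= (B_{k+1} \cap (A_k - B_k)) \cup (B_{k+1} \cap D_k) \cup C_{k+1}.
\end{aligned}$$

If we put $D_{k+1} = (B_{k+1} \cap (A_k - B_k)) \cup (B_{k+1} \cap D_k)$, then $B_{k+1} = D_{k+1} \cup C_{k+1}$ 
and 

$$P(D_{k+1}) \le P(A_k - B_k) + P(D_k) \le 3^{-k}\varepsilon + \sum_{j=1}^{k-1} 3^{-j}\varepsilon = \sum_{j=1}^{k} 3^{-j}\varepsilon.$$

The identity $A_{k+1} = (A_{k+1} - B_{k+1}) \cup B_{k+1}$ and the inequalities $P(A_{k+1}) \ge \varepsilon$ and 
$P(A_{k+1} - B_{k+1}) \le 3^{-k-1}\varepsilon$ together imply that $P(B_{k+1}) \ge (1 - 3^{-k-1})\varepsilon$. From 
$B_{k+1} = D_{k+1} \cup C_{k+1}$ and $P(D_{k+1}) \le \sum_{j=1}^{k} 3^{-j}\varepsilon$, we therefore conclude that 
$P(C_{k+1}) \ge \bigl(1 - \sum_{j=1}^{k+1} 3^{-j}\bigr)\varepsilon$. This completes the induction, and (*) is thereby 
proved. 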

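The set $C_n$ is in $\mathcal{A}_{F_n}$ since $F_1 \subseteq F_2 \subseteq \cdots \subseteq F_n$, and thus $C_n$ is given by 

$$C_n = \{\omega \in \Omega \mid x_{F_n}(\omega) \in L_n\}$$

for some Borel subset $L_n$ of $K_n$ in $S^{F_n}$. For $1 \le j \le n$, we have 

$$B_j = \{\omega \in \Omega \mid x_{F_n}(\omega) \in K_j \times S^{F_n - F_j}\},$$

and the set $K_j \times S^{F_n - F_j}$ is closed in $S^{F_n}$ for $j < n$ and compact for $j = n$. Thus 
$L_n = \bigcap_{j=1}^{n} (K_j \times S^{F_n - F_j})$ is a compact subset of $S^{F_n}$. 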

If $F \subseteq F'$, let us identify $S^{F'}$ with the subset $S^{F'} \times \{0\}$ of $\Omega = S^I$, so that it is 
meaningful to apply $x_F$ to $S^{F'}$. Then we have $x_F x_{F'} = x_F$, and $x_{F_n}(L_p)$ makes 
sense for $p \ge n$. 

If $p \ge q$, then we have 

$$x_{F_p}^{-1}(L_p) = C_p \subseteq C_q = x_{F_q}^{-1}(L_q) = x_{F_p}^{-1}(L_q \times S^{F_p - F_q})$$

and hence $L_p \subseteq L_q \times S^{F_p - F_q}$. Application of $x_{F_q}$ gives $x_{F_q}(L_p) \subseteq L_q$. If 
$p \ge q \ge n$, then the further application of $x_{F_n}$ gives $x_{F_n}(L_p) \subseteq x_{F_n}(L_q) \subseteq L_n$. 
Thus the sets $x_{F_n}(L_p)$, as $p$ varies for $p \ge n$, form a decreasing sequence of 
compact sets in $S^{F_n}$. Since $P(C_p) \ge \varepsilon/2$ by (*), $C_p$ is not empty; thus $L_p$ is not 
empty and $x_{F_n}(L_p)$ is not empty. Since $L_n$ is a compact metric space, 

$$M_n = \bigcap_{p=n}^{\infty} x_{F_n}(L_p)$$

is not empty. 
Let us prove that 

$$x_{F_n}(M_{n+1}) = M_n. \tag{**}$$

For $p \ge n+1$, we have $x_{F_n}(M_{n+1}) \subseteq x_{F_n}(x_{F_{n+1}}(L_p)) = x_{F_n}(L_p)$. Intersecting 
the right side over $p$ gives $x_{F_n}(M_{n+1}) \subseteq M_n$. For the reverse inclusion, let $m$ be 
in $M_n$. Then $m = x_{F_n}(\ell_p)$ with $\ell_p \in L_p$ for $p \ge n+1$. For the same $\ell_p$’s, define 
$m'_p = x_{F_{n+1}}(\ell_p)$. Then $x_{F_n}(m'_p) = x_{F_n}(x_{F_{n+1}}(\ell_p)) = x_{F_n}(\ell_p) = m$. The element 
$m'_p$ is in $x_{F_{n+1}}(L_p)$ and hence in $\bigcap_{q=n+1}^{p} x_{F_{n+1}}(L_q)$. The elements $m'_p$ all lie in the 
compact set $L_{n+1}$, and hence they have a convergent subsequence $\{m'_{p_k}\}$. The limit 
$m'$ of this subsequence is in $\bigcap_{q=n+1}^{p_k} x_{F_{n+1}}(L_q)$ for all $k$, and thus $m'$ is in $M_{n+1}$. 
Since $x_{F_n}(m'_{p_k}) = m$, we have $x_{F_n}(m') = x_{F_n}(\lim_k m'_{p_k}) = \lim_k x_{F_n}(m'_{p_k}) = m$. 
In other words, $m$ lies in $x_{F_n}(M_{n+1})$. This proves (**). 

Using (**), we shall define disjoint coordinate blocks of an element $\omega$ in $\Omega$. 
Pick some $m_1$ in $M_1$, use (**) to find some $m_2$ in $M_2$ with $m_1 = x_{F_1}(m_2)$, use 
(**) to find some $m_3$ in $M_3$ with $m_2 = x_{F_2}(m_3)$, and so on. Define $\omega$ so that 
$x_{F_1}(\omega) = m_1$ and $x_{F_n - F_{n-1}}(\omega) = m_n - m_{n-1}$ for $n \ge 2$. Define $\omega$ to be 0 in all 
coordinates indexed by $I - \bigcup_{n=1}^{\infty} F_n$. Then we have 

$$x_{F_n}(\omega) = x_{F_1}(\omega) + \sum_{k=2}^{n} x_{F_k - F_{k-1}}(\omega) = m_1 + \sum_{k=2}^{n} (m_k - m_{k-1}) = m_n.$$

Thus $x_{F_n}(\omega)$ is exhibited as in $M_n \subseteq L_n$ for all $n$. Hence $\omega$ is in $\bigcap_{n=1}^{\infty} C_n$, and 
we have succeeded in proving that $\bigcap_{n=1}^{\infty} C_n$ is not empty. 


Corollary 9.6. Let $I$ be a nonempty index set, and for each $i$ in $I$ let $\mu_i$ be a 
Borel measure on $\mathbb{R}$ with $\mu_i(\mathbb{R}) = 1$. Then there exists a probability space with 
independent random variables $x_i$ for $i$ in $I$ such that $x_i$ has distribution $\mu_i$. 


PROOF. In Theorem 9.5 let $S = \mathbb{R}$, and for each finite subset $F$ of $I$, define 
$P|_{\mathcal{A}_F}$ to be the product measure $\prod_{i \in F} \mu_i$ on the Euclidean space $\mathbb{R}^F$. The theorem 
makes $\mathbb{R}^I$ into a probability space by exhibiting the consistent extension $P$ of 
all the $P|_{\mathcal{A}_F}$’s as completely additive. Then the coordinate functions $x_i$ are the 
required independent random variables. 


4. Strong Law of Large Numbers 

Traditional laws of large numbers concern a sequence $\{x_n\}$ of identically 
distributed independent random variables, and we shall assume that their common 
expectation $E$ exists. Define $s_n = x_1 + \cdots + x_n$ for $n \ge 1$. The conclusion is 
that the quantities $\frac{1}{n} s_n$ converge in some sense to $E$, i.e., that the $x_n$ are Cesàro 
summable to $E$. The simplest versions of the law of large numbers assume also 
that the common “variance” is finite. Let us back up a moment and define this 
notion. 

The variance of a random variable $x$ with mean $E$ is the quantity 

$$\mathrm{Var}(x) = E((x - E)^2) = E(x^2) - E^2,$$

the right-hand equality holding since 

$$E((x - E)^2) = E(x^2) - 2E(x)E + E^2 E(1) = E(x^2) - E^2.$$


For any random variables the expectations add since expectation is linear. For 
two independent random variables $x$ and $y$, the variances add since we can apply 
Proposition 9.3, compute the quantities 

$$E((x + y)^2) = E(x^2) + 2E(xy) + E(y^2) = E(x^2) + 2E(x)E(y) + E(y^2)$$

and 

$$(E(x) + E(y))^2 = E(x)^2 + 2E(x)E(y) + E(y)^2,$$

and subtract to obtain 

$$\mathrm{Var}(x + y) = (E(x^2) - E(x)^2) + (E(y^2) - E(y)^2) = \mathrm{Var}(x) + \mathrm{Var}(y).$$

For a constant multiple $c$ of a random variable $x$, we have 

$$E(cx) = cE(x) \qquad \text{and} \qquad \mathrm{Var}(cx) = c^2\, \mathrm{Var}(x).$$

Returning to our sequence $\{x_n\}$ of identically distributed independent random 
variables, we therefore have $E(s_n) = E(x_1) + \cdots + E(x_n) = nE$ and $\mathrm{Var}(s_n) = 
\mathrm{Var}(x_1) + \cdots + \mathrm{Var}(x_n) = n\sigma^2$, where $\sigma^2$ denotes the common variance of the 
given random variables $x_k$. Consequently 

$$E(\tfrac{1}{n} s_n) = E \qquad \text{and} \qquad \mathrm{Var}(\tfrac{1}{n} s_n) = \tfrac{1}{n}\sigma^2.$$


If we take our probability space to be $(\Omega, P)$ and apply Chebyshev’s inequality 
to the variance10 of $\frac{1}{n} s_n$, we obtain 

$$\tfrac{1}{n}\sigma^2 = \int_{\Omega} (\tfrac{1}{n} s_n - E)^2 \, dP \ge \xi^2 P(\{|\tfrac{1}{n} s_n - E| \ge \xi\}).$$

Holding $\xi$ fixed and letting $n$ tend to infinity, we obtain the first form historically 
of the law of large numbers, as follows. 


10Chebyshev’s inequality appears in Section VI.10 of Basic and is the elementary inequality 
$\int_X |f|^2 \, d\mu \ge \xi^2 \mu(\{x \mid |f(x)| \ge \xi\})$, valid on any measure space for any measurable $f$ and any real 
$\xi > 0$. 

Theorem 9.7 (Weak Law of Large Numbers). Let $\{x_n\}$ be a sequence of 
identically distributed independent random variables with a common expectation 
$E$ and a common finite variance. Define $s_n = x_1 + \cdots + x_n$. Then for every real 
$\xi > 0$, 

$$\lim_{n \to \infty} P(\{|\tfrac{1}{n} s_n - E| \ge \xi\}) = 0.$$

The statement in words is that $\frac{1}{n} s_n$ converges to $E$ in probability. With more 
effort one can prove the same theorem without the hypothesis of finite variance. 
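
To see the size of the quantities in the Chebyshev estimate, one can compute them exactly in a simple case. The sketch below assumes the $x_n$ are fair coin flips with values 0 and 1, so that $E = 1/2$ and $\sigma^2 = 1/4$; this choice, the value $\xi = 1/10$, and the function names are illustrative assumptions. It compares the exact binomial probability $P(\{|\frac{1}{n}s_n - E| \ge \xi\})$ with the bound $\sigma^2/(n\xi^2)$.

```python
from fractions import Fraction
from math import comb


def tail_probability(n, xi):
    """Exact P(|s_n / n - 1/2| >= xi) when s_n counts heads in n fair coin flips."""
    hits = sum(comb(n, k) for k in range(n + 1)
               if abs(Fraction(k, n) - Fraction(1, 2)) >= xi)
    return Fraction(hits, 2 ** n)


def chebyshev_bound(n, xi):
    """The bound sigma^2 / (n xi^2) with sigma^2 = 1/4."""
    return Fraction(1, 4) / (n * xi * xi)


xi = Fraction(1, 10)
for n in (10, 100, 1000):
    # Print n, the exact probability, and the Chebyshev bound.
    print(n, float(tail_probability(n, xi)), float(chebyshev_bound(n, xi)))
```

In this example the exact probabilities are far smaller than the bound for large $n$; Chebyshev’s inequality gives only the rate $1/n$, which is nevertheless enough for the theorem.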

As a practical matter, the fact that $P(\{|\frac{1}{n} s_n - E| \ge \xi\})$ tends to 0 is of 
comparatively little interest. Of more interest is a probability estimate for the 
event that $\lim \frac{1}{n} s_n = E$. This is contained in the following theorem, whose proof 
will occupy the remainder of this section. 


Theorem 9.8 (Strong Law of Large Numbers). Let $\{x_n\}$ be a sequence of 
identically distributed independent random variables whose common expectation 
$E$ exists. Define $s_n = x_1 + \cdots + x_n$. Then 

$$\lim_{n \to \infty} \tfrac{1}{n} s_n = E \qquad \text{with probability 1.}$$


Many members of the public have heard of this theorem in some form. 
Misconceptions abound, however. The usual misconception is that if the average 
$\frac{1}{n} s_n(\omega)$ has gotten to be considerably larger than $E$ by some point $n$ in time, then 
the chances become overwhelming that the average will have corrected itself 
fairly soon thereafter. Independence says otherwise: that the future values of 
the $x_k$’s are not influenced by what has happened through time $n$. In fact, if 
a person is persuaded that it was unreasonable for the average $\frac{1}{n} s_n(\omega)$ to have 
gotten considerably larger than $E$ by some time $n$, then the person might better 
instead question whether the expectation $E$ is known correctly or even whether the 
individual $x_k$’s are genuinely independent. If $E$ has been greatly underestimated, 
for example, not only was it reasonable for the average $\frac{1}{n} s_n(\omega)$ to have gotten 
considerably larger than $E$, but it is reasonable for it to continue to do so. 
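
A single simulated sample point $\omega$ gives a feel for the statement of Theorem 9.8. The sketch below is an illustration only: it assumes fair coin flips with values 0 and 1 (so $E = 1/2$), uses Python’s pseudorandom generator with a fixed seed as a stand-in for independence, and records the running averages $\frac{1}{n} s_n(\omega)$. The function name and the sample size are hypothetical choices.

```python
import random


def running_averages(n, seed=0):
    """Return [s_1/1, s_2/2, ..., s_n/n] for one simulated sequence of fair 0/1 flips."""
    rng = random.Random(seed)       # fixed seed so that the run is reproducible
    total = 0
    averages = []
    for k in range(1, n + 1):
        total += rng.randint(0, 1)  # x_k, with expectation 1/2
        averages.append(total / k)
    return averages


averages = running_averages(200_000)
late_deviation = max(abs(a - 0.5) for a in averages[100_000:])
print(averages[-1], late_deviation)
```

The averages settle near 1/2 because later flips swamp any early excess in the quotient $s_n/n$, not because later flips compensate for earlier ones; a finite simulation can suggest the theorem but cannot establish a statement about limits.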

The proof of Theorem 9.8 will be preceded by three lemmas. 


Lemma 9.9 (Borel-Cantelli Lemma). Let $\{A_k\}$ be a sequence of events in a 
probability space $(\Omega, P)$ such that $\sum_{k=1}^{\infty} P(A_k) < \infty$. Then $P\bigl(\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k\bigr) 
= 0$. Hence the probability that infinitely many of the events $A_k$ occur is 0. 

PROOF. Since $\sum_{k=1}^{\infty} P(A_k)$ is convergent, we have $\limsup_n \sum_{k=n}^{\infty} P(A_k) = 0$. 
For every $n$, we have $P\bigl(\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k\bigr) \le P\bigl(\bigcup_{k=n}^{\infty} A_k\bigr) \le \sum_{k=n}^{\infty} P(A_k)$. The 
left side of the inequality is independent of $n$, and therefore $P\bigl(\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k\bigr) \le 
\limsup_n \sum_{k=n}^{\infty} P(A_k) = 0$. This proves the first conclusion. Since the set 
$\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$ is the set of $\omega$ that lie in infinitely many of the sets $A_k$, the 
second conclusion follows. 


Lemma 9.10. Let $x$ be a random variable on a probability space $(\Omega, P)$. Then 
$\sum_{k=1}^{\infty} P(\{|x| > k\}) < \infty$ if and only if the expectation of $|x|$ exists. 

PROOF. Proposition 6.56b of Basic gives 

$$\int_{\Omega} |x| \, dP = \int_0^{\infty} P(\{|x(\omega)| > \xi\}) \, d\xi.$$

The lemma therefore follows from the inequalities 

$$\sum_{k=1}^{\infty} P(\{|x| > k\}) = \sum_{k=0}^{\infty} P(\{|x| > k+1\}) \le \sum_{k=0}^{\infty} \int_k^{k+1} P(\{|x| > \xi\}) \, d\xi = \int_0^{\infty} P(\{|x| > \xi\}) \, d\xi \le \sum_{k=0}^{\infty} P(\{|x| > k\}).$$


Lemma 9.11 (Kolmogorov’s inequality). Let $x_1, \dots, x_n$ be independent random 
variables on a probability space $(\Omega, P)$, and suppose that $E(x_k) = 0$ and 
$E(x_k^2) < \infty$ for all $k$. Put $s_k = x_1 + \cdots + x_k$. Then 

$$P(\{\omega \mid \max(|s_1|, \dots, |s_n|) > c\}) \le c^{-2} E(s_n^2)$$

for every real $c > 0$. 


REMARKS. It is not necessary to assume that $E(x_k) = 0$. For $n = 1$, the 
lemma consequently reduces to Chebyshev’s inequality. 


PROOF. Let $A_j$ be the event that $j$ is the smallest index for which $|s_j| > c$. 
The sets $A_j$ are disjoint, and their union is the set whose probability occurs on 
the left side of the displayed inequality. Combining this fact with Chebyshev’s 
inequality gives 

$$P(\{\omega \mid \max(|s_1|, \dots, |s_n|) > c\}) = \sum_{j=1}^{n} P(A_j) \le c^{-2} \sum_{j=1}^{n} E(s_j^2 I_{A_j}), \tag{*}$$

where $I_{A_j}$ is the indicator function of $A_j$. Since $s_n = s_j + (s_n - s_j)$, 

$$\begin{aligned}
E(s_n^2 I_{A_j}) &= E(s_j^2 I_{A_j}) + 2E((s_n - s_j) s_j I_{A_j}) + E((s_n - s_j)^2 I_{A_j}) \\
&\ge E(s_j^2 I_{A_j}) + 2E((s_n - s_j) s_j I_{A_j}).
\end{aligned}$$

The random variables $s_n - s_j$ and $s_j I_{A_j}$ are independent by Proposition 9.4, 
and their product has expectation 0 by Proposition 9.3 since $E(s_n - s_j) = 
\sum_{i=j+1}^{n} E(x_i) = 0$. Therefore $E(s_n^2 I_{A_j}) \ge E(s_j^2 I_{A_j})$, and (*) gives 

$$\begin{aligned}
P(\{\omega \mid \max(|s_1|, \dots, |s_n|) > c\}) &\le c^{-2} \sum_{j=1}^{n} E(s_j^2 I_{A_j}) \le c^{-2} \sum_{j=1}^{n} E(s_n^2 I_{A_j}) \\
&= c^{-2} E\bigl(s_n^2 I_{\bigcup_j A_j}\bigr) \le c^{-2} E(s_n^2),
\end{aligned}$$

as required. 
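
Kolmogorov’s inequality can be checked by brute force in a small case. The sketch below assumes the $x_k$ are independent and take the values $+1$ and $-1$ with probability $1/2$ each, so that $E(x_k) = 0$ and $E(s_n^2) = n$; the choices $n = 12$ and $c = 5$ and the function names are illustrative. It enumerates all $2^n$ equally likely sign patterns and computes both sides of the inequality exactly.

```python
from fractions import Fraction
from itertools import accumulate, product


def max_exceeds(n, c):
    """Exact P(max(|s_1|, ..., |s_n|) > c) for n independent fair +1/-1 steps."""
    count = sum(1 for steps in product((-1, 1), repeat=n)
                if max(abs(s) for s in accumulate(steps)) > c)
    return Fraction(count, 2 ** n)


def second_moment(n):
    """Exact E(s_n^2), averaged over all 2^n equally likely sign patterns."""
    total = sum(sum(steps) ** 2 for steps in product((-1, 1), repeat=n))
    return Fraction(total, 2 ** n)


n, c = 12, 5
lhs = max_exceeds(n, c)
rhs = second_moment(n) / c ** 2
print(float(lhs), float(rhs))
```

The point of the lemma is visible here: the bound $c^{-2}E(s_n^2)$ is the one that Chebyshev’s inequality gives for $|s_n|$ alone, yet it controls the maximum of all the partial sums.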
PROOF OF THEOREM 9.8. Let the underlying probability space be denoted by 
$(\Omega, P)$. Subtraction of the constant $E$ from each of the random variables $x_k$ does 
not affect the independence, according to Proposition 9.4, and it reduces the proof 
to the case that $E = 0$. Therefore we may proceed under the assumption that 
$E = 0$. For integers $k \ge 1$, define 

$$x'_k = \begin{cases} x_k & \text{where } |x_k| \le k, \\ 0 & \text{where } |x_k| > k, \end{cases}
\qquad
x''_k = \begin{cases} 0 & \text{where } |x_k| \le k, \\ x_k & \text{where } |x_k| > k, \end{cases}$$

so that $x_k = x'_k + x''_k$. Define $s'_n = x'_1 + \cdots + x'_n$ and $s''_n = x''_1 + \cdots + x''_n$. It is 
enough to show that $\frac{1}{n} s'_n$ and $\frac{1}{n} s''_n$ both tend to 0 with probability 1. 

First we show that $\frac{1}{n} s''_n$ tends to 0 with probability 1. Let $x$ be a random 
variable with the same distribution as the $x_k$’s. Referring to the definition of $x''_k$, 
we see that $P(\{|x| > k\}) = P(\{|x_k| > k\}) = P(\{x''_k \ne 0\})$. Since $E(|x|)$ exists 
by assumption, Lemma 9.10 shows that $\sum_{k=1}^{\infty} P(\{|x| > k\}) < \infty$. Therefore 
$\sum_{k=1}^{\infty} P(\{x''_k \ne 0\}) < \infty$. By the Borel-Cantelli Lemma (Lemma 9.9), the 
probability that $\omega$ lies in infinitely many of the sets $\{x''_k \ne 0\}$ is 0. Thus by 
disregarding $\omega$’s in a set of probability 0, we may assume $x''_k(\omega) \ne 0$ for only 
finitely many $k$. Then $s''_n(\omega)$ remains constant as a function of $n$ for large $n$, and 
we must have $\lim_n \frac{1}{n} s''_n(\omega) = 0$. 

Now we consider $\frac{1}{n} s'_n$. The random variables $x'_k$ are independent, but they 
are no longer identically distributed and they no longer need have expectation 0. 
However, they satisfy inequalities of the form $|x'_k| \le k$, and these in turn imply 
that each $E((x'_k)^2)$ is finite. Concerning the expectations, let $x$ be a random variable 
with the same distribution as any of the $x_k$’s. The random variable $x^{\#}_k$ equal to 
$x$ where $|x| \le k$ and equal to 0 otherwise has $|x^{\#}_k| \le |x|$ for all $k$, and hence 
dominated convergence yields $\lim_k E(x^{\#}_k) = E(x) = 0$. Since $x'_k$ and $x^{\#}_k$ have the 
same distribution, we have $\lim_k E(x'_k) = 0$. The expression $E(\frac{1}{n} s'_n)$ is a Cesàro 
sum of the sequence $\{E(x'_k)\}$. Since the Cesàro sums tend to 0 when the sequence 
itself tends to 0, we conclude that 

$$\lim_n E(\tfrac{1}{n} s'_n) = 0. \tag{*}$$
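Let $\mu$ be the common distribution of the $|x_k|$’s. The next step is to show that 

$$\sum_{r=1}^{\infty} 2^{-2r} \sum_{k=2^{r-1}}^{2^r - 1} E((x'_k)^2) \le 2 \int_0^{\infty} t \, d\mu(t). \tag{**}$$

The quantity on the right is twice the common value of $E(|x_k|)$ and is finite since 
we have assumed that the common expectation of the $x_k$’s exists. Once we have 
proved (**), we can therefore conclude that the quantity on the left side is finite. 
To prove (**), we write 

$$\begin{aligned}
\sum_{r=1}^{\infty} 2^{-2r} \sum_{k=2^{r-1}}^{2^r - 1} E((x'_k)^2) &= \sum_{r=1}^{\infty} 2^{-2r} \sum_{k=2^{r-1}}^{2^r - 1} \int_0^k t^2 \, d\mu(t) \\
&\le \sum_{r=1}^{\infty} 2^{-r} \int_0^{2^r} t^2 \, d\mu(t) \\
&\le \int_0^1 t^2 \, d\mu(t) + \sum_{r=1}^{\infty} 2^{-r} \int_1^{2^r} t^2 \, d\mu(t).
\end{aligned}$$

Let us write I and II for the two terms on the right side. The estimate for II is 

$$\begin{aligned}
\mathrm{II} &= \sum_{r=1}^{\infty} 2^{-r} \sum_{j=1}^{r} \int_{2^{j-1}}^{2^j} t^2 \, d\mu(t) \le \sum_{r=1}^{\infty} \sum_{j=1}^{r} 2^{-r} 2^j \int_{2^{j-1}}^{2^j} t \, d\mu(t) \\
&= \sum_{j=1}^{\infty} \Bigl(\sum_{r \ge j} 2^{-r}\Bigr) 2^j \int_{2^{j-1}}^{2^j} t \, d\mu(t) = 2 \sum_{j=1}^{\infty} \int_{2^{j-1}}^{2^j} t \, d\mu(t) = 2 \int_1^{\infty} t \, d\mu(t).
\end{aligned}$$

Therefore 

$$\begin{aligned}
\mathrm{I} + \mathrm{II} &\le \int_0^1 t^2 \, d\mu(t) + 2 \int_1^{\infty} t \, d\mu(t) \\
&\le 2 \int_0^1 t \, d\mu(t) + 2 \int_1^{\infty} t \, d\mu(t) = 2 \int_0^{\infty} t \, d\mu(t),
\end{aligned}$$

and (**) is proved. 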

Form the sequence of random variables $x^*_k = x'_k - E(x'_k)$, and put $s^*_n = 
x^*_1 + \cdots + x^*_n$. The $x^*_k$ are independent but no longer identically distributed. They 
have expectation 0. Since 

$$E((x^*_k)^2) = E((x'_k - E(x'_k))^2) = E((x'_k)^2) - E(x'_k)^2 \le E((x'_k)^2),$$

(**) shows that the $x^*_k$ have the property that $\sum_{r=1}^{\infty} 2^{-2r} \sum_{k=2^{r-1}}^{2^r - 1} E((x^*_k)^2) < \infty$. To 
prove the theorem, it would be enough to prove that the Cesàro sums $\frac{1}{n} s^*_n = 
\frac{1}{n} s'_n - E(\frac{1}{n} s'_n)$ tend to 0, since we know from (*) that $\lim_n E(\frac{1}{n} s'_n) = 0$. 
Changing notation, we see that we have reduced matters to proving the fol- 
lowing: if {x,} is a sequence of independent random variables with expectation 0 
and with 


[ee) 2" -1 
ore Oe BaZa oo: (+) 


r=1 k=2'-! 


and if s, denotes x; + ---+ x,, then lim, 1 Sn = 0 with probability 1. 
To prove this assertion, we apply Kolmogorov's inequality (Lemma 9.11) for
each $r > 0$ to the $2^{r-1}$ random variables $x_{2^{r-1}}, x_{2^{r-1}+1}, \dots, x_{2^r-1}$. These are
independent with expectation 0, and $E(x_k^2)$ is finite for each $k$ by (+). Their partial
sums are

\[
s_{2^{r-1}} - s_{2^{r-1}-1},\ \dots,\ s_{2^r-1} - s_{2^{r-1}-1},
\]

and the last partial sum has $E\bigl((s_{2^r-1} - s_{2^{r-1}-1})^2\bigr) = \sum_{k=2^{r-1}}^{2^r-1} E(x_k^2)$ by Proposition
9.3. Kolmogorov's inequality therefore gives, for any fixed $\epsilon > 0$,

\[
P\bigl(\{\max(|s_{2^{r-1}} - s_{2^{r-1}-1}|, \dots, |s_{2^r-1} - s_{2^{r-1}-1}|) > 2^r\epsilon\}\bigr)
\le \epsilon^{-2}\, 2^{-2r} \sum_{k=2^{r-1}}^{2^r-1} E(x_k^2).
\]


Summing on $r$ and applying (+), we see that

\[
\sum_{r=1}^{\infty} P\bigl(\{\max(2^{-r}|s_{2^{r-1}} - s_{2^{r-1}-1}|, \dots, 2^{-r}|s_{2^r-1} - s_{2^{r-1}-1}|) > \epsilon\}\bigr) < \infty.
\]

The Borel–Cantelli Lemma (Lemma 9.9) shows that with probability 1, there are
only finitely many $r$'s for which

\[
\max(2^{-r}|s_{2^{r-1}} - s_{2^{r-1}-1}|, \dots, 2^{-r}|s_{2^r-1} - s_{2^{r-1}-1}|) > \epsilon.
\]


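Fix any $\omega$ that is not in the exceptional set $A_\epsilon$ of probability 0, and choose
$r_0 = r_0(\omega)$ such that

\[
\max(2^{-r}|s_{2^{r-1}}(\omega) - s_{2^{r-1}-1}(\omega)|, \dots, 2^{-r}|s_{2^r-1}(\omega) - s_{2^{r-1}-1}(\omega)|) \le \epsilon
\]

for all $r \ge r_0$. If $n \ge 2^{r_0}$ is given, find $r$ such that $2^{r-1} \le n \le 2^r - 1$. Then we
have

\[
\begin{aligned}
2^{-r}\,|s_n(\omega) - s_{2^{r-1}-1}(\omega)| &\le \epsilon, \\
2^{-(r-1)}\,|s_{2^{r-1}-1}(\omega) - s_{2^{r-2}-1}(\omega)| &\le \epsilon, \\
&\ \,\vdots \\
2^{-r_0}\,|s_{2^{r_0}-1}(\omega) - s_{2^{r_0-1}-1}(\omega)| &\le \epsilon.
\end{aligned}
\]

Multiplying the $k^{\text{th}}$ inequality by $2^{-k+1}$, summing for $k \ge 1$, and applying the
triangle inequality, we obtain

\[
n^{-1}|s_n(\omega) - s_{2^{r_0-1}-1}(\omega)| \le 2^{-r+1}|s_n(\omega) - s_{2^{r_0-1}-1}(\omega)| \le 4\epsilon.
\]

Therefore $n^{-1}|s_n(\omega)| \le 4\epsilon + n^{-1}|s_{2^{r_0-1}-1}(\omega)|$.
Hence $\limsup_n n^{-1}|s_n(\omega)| \le 4\epsilon$.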
n 
If $\omega$ is not in the union $\bigcup_{m=1}^{\infty} A_{1/m}$ of the exceptional sets, then $\limsup_n n^{-1}|s_n(\omega)|
= 0$. This countable union of exceptional sets of probability 0 has probability 0,
and the proof is therefore complete.
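
The dyadic-block bookkeeping in the argument above can be watched numerically. The following is only an illustrative sketch, not part of the proof: it assumes independent variables uniform on $[-1, 1]$ (so the expectation is 0), and the function names `running_means` and `dyadic_block_maxima` are hypothetical. It records the largest value of $|s_n/n|$ over each block $2^{r-1} \le n \le 2^r - 1$.

```python
import random


def running_means(n, seed=0):
    """Return the list of s_k / k for k = 1..n, where s_k = x_1 + ... + x_k and
    the x_k are independent and uniform on [-1, 1] (an assumed example)."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for k in range(1, n + 1):
        total += rng.uniform(-1.0, 1.0)
        means.append(total / k)
    return means


def dyadic_block_maxima(means):
    """For each r >= 1 whose block fits, return the maximum of |s_n / n|
    over the dyadic block 2**(r-1) <= n <= 2**r - 1."""
    maxima = []
    r = 1
    while 2 ** r - 1 <= len(means):
        lo, hi = 2 ** (r - 1), 2 ** r - 1  # 1-based indices n in the block
        maxima.append(max(abs(means[n - 1]) for n in range(lo, hi + 1)))
        r += 1
    return maxima


if __name__ == "__main__":
    means = running_means(2 ** 17 - 1)
    for r, m in enumerate(dyadic_block_maxima(means), start=1):
        print(f"block r={r:2d}: max |s_n/n| = {m:.5f}")
```

In a typical run the block maxima shrink as $r$ grows, which is the behavior that the Borel–Cantelli step guarantees with probability 1.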
BIBLIOGRAPHICAL REMARKS. The proof of Theorem 9.5 is adapted from
Doob’s Measure Theory, and the proof of Theorem 9.8 is adapted from Feller’s 
Volume II of An Introduction to Probability Theory and Its Applications. 


5. Problems 


1. If $x$ is a random variable with distribution $\mu_x$, find a formula for the distribution
$\mu_{|x|}$ of $|x|$ in terms of $\mu_x$.


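2. Let $x_1, \dots, x_N$ be random variables on a probability space $(\Omega, P)$, let $\mu_{x_1,\dots,x_N}$
be their joint distribution, and let $\Phi : \mathbb{R}^N \to \mathbb{R}$ be a nonnegative Borel function.
Prove that

\[
\int_{\mathbb{R}^N} \Phi(t_1, \dots, t_N)\, d\mu_{x_1,\dots,x_N}(t_1, \dots, t_N) = \int_{\mathbb{R}} s\, d\mu_{\Phi\circ(x_1,\dots,x_N)}(s),
\]

where $\mu_{\Phi\circ(x_1,\dots,x_N)}$ is the distribution of $\Phi \circ (x_1, \dots, x_N)$.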


3. Suppose on a probability space $(\Omega, P)$ that $\{y_n\}_{n=1}^{\infty}$ is a sequence of random
variables with a common expectation $E$ and with variance $\sigma_n^2$, and suppose that
$\Phi : \mathbb{R} \to \mathbb{R}$ is a bounded continuous function.

(a) Prove that $P(\{|y_n - E| \ge \delta\}) \le \sigma_n^2\delta^{-2}$ for all $n$.

(b) Suppose that $|\Phi| \le M$ and that $\delta$ and $\epsilon$ are positive numbers such that
$|t - E| < \delta$ implies $|\Phi(t) - \Phi(E)| < \epsilon$. Prove that $|E(\Phi(y_n)) - \Phi(E)| \le
\epsilon + 2M\sigma_n^2\delta^{-2}$.

(c) Prove that if $\lim_n \sigma_n^2 = 0$, then $\lim_n E(\Phi(y_n)) = \Phi(E)$.

(d) Show that the argument in (c) continues to work if $\Phi$ is the indicator function
of an interval whose closure does not contain $E$. Why does the conclusion
in this case contain the conclusion of the Weak Law of Large Numbers as in
Theorem 9.7?


4. (Bernstein polynomials) This problem gives a constructive proof of the Weierstrass
Approximation Theorem by using probability theory.

(a) Fix $p$ with $0 \le p \le 1$. A certain unbalanced coin comes up "heads" with
probability $p$ and "tails" with probability $1 - p$; "heads" is scored as the
outcome 1, and "tails" is scored as the outcome 0. Set up a probability model
$(\Omega, P)$ for a sequence of independent coin tosses of this unbalanced coin,
and let $x_n$ be the outcome of the $n^{\text{th}}$ toss.

(b) Show that the expectation of the outcome of a single toss of the coin is $p$
and the variance is $p(1 - p)$.

(c) Let $s_n = x_1 + \cdots + x_n$. Show for each integer $k$ with $0 \le k \le n$ that
$P(\{s_n = k\}) = \binom{n}{k} p^k (1 - p)^{n-k}$.

(d) For continuous $\Phi : [0,1] \to \mathbb{R}$, extend $\Phi$ to all of $\mathbb{R}$ so as to be constant
on $(-\infty, 0]$ and on $[1, +\infty)$. Apply the result of Problem 3c to show that
$\lim_n \sum_{k=0}^{n} \Phi\bigl(\tfrac{k}{n}\bigr) \binom{n}{k} p^k (1 - p)^{n-k} = \Phi(p)$.

(e) Prove that the convergence in (d) is uniform for $0 \le p \le 1$, and conclude
that $\Phi$ is the uniform limit of an explicit sequence of polynomials on $[0, 1]$.
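
The polynomials in Problem 4d can be evaluated directly, which gives a quick numerical feel for the uniform convergence asserted in (e). The sketch below is only an illustration and not a solution of the problem; the names `bernstein` and `sup_error` and the test function $|t - \tfrac12|$ are choices made here, not taken from the text.

```python
from math import comb


def bernstein(phi, n, p):
    """Value at p of sum_{k=0}^{n} phi(k/n) * C(n, k) * p**k * (1 - p)**(n - k)."""
    return sum(
        phi(k / n) * comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )


def sup_error(phi, n, grid=200):
    """Largest |B_n(phi)(p) - phi(p)| over the grid p = j/grid, j = 0..grid."""
    return max(
        abs(bernstein(phi, n, j / grid) - phi(j / grid))
        for j in range(grid + 1)
    )


if __name__ == "__main__":
    phi = lambda t: abs(t - 0.5)  # continuous but not differentiable at 1/2
    for n in (10, 100):
        print(n, sup_error(phi, n))
```

For $\Phi(t) = t^2$ the sum can be computed in closed form from Problem 4b as $p^2 + p(1-p)/n$, which makes a convenient check of the code.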


Problems 5–9 are closely related to the Kolmogorov Extension Theorem (Theorem
9.5) and in a sense explain the mystery behind its proof. Let $X$ be a compact metric
space, and for each integer $n \ge 1$, let $X_n$ be a copy of $X$. Define $\Omega^{(N)} = \times_{n=1}^{N} X_n$,
and let $\Omega = \times_{n=1}^{\infty} X_n$. Each of $\Omega^{(N)}$ and $\Omega$ is given the product topology. If $E$
is a Borel subset of $\Omega^{(N)}$, we can regard $E$ as a subset of $\Omega$ by identifying $E$ with
$E \times \bigl(\times_{n=N+1}^{\infty} X_n\bigr)$. In this way any Borel measure on $\Omega^{(N)}$ can be regarded as a
measure on a certain σ-subalgebra $\mathcal{F}_N$ of the σ-algebra $\mathcal{B}(\Omega)$ of Borel sets.

5. Prove that $\bigcup_{N=1}^{\infty} \mathcal{F}_N = \mathcal{F}$ is an algebra of sets.


6. Let $\nu_N$ be a (regular) Borel measure on $\Omega^{(N)}$ with $\nu_N(\Omega^{(N)}) = 1$, and regard $\nu_N$
as defined on $\mathcal{F}_N$. Suppose for each $N$ that $\nu_N$ agrees with $\nu_{N+1}$ on $\mathcal{F}_N$. Define
$\nu(E)$ for $E$ in $\mathcal{F}$ to be the common value of $\nu_N(E)$ for $N$ large. Prove that $\nu$ is
nonnegative additive, and prove that in a suitable sense $\nu$ is regular on $\mathcal{F}$.


7. Using the kind of regularity established in the previous problem, prove that $\nu$ is
completely additive on $\mathcal{F}$.


8. In view of Problems 6 and 7, $\nu$ extends to a measure on the smallest σ-algebra
for $\Omega$ containing $\mathcal{F}$. Prove that this σ-algebra is $\mathcal{B}(\Omega)$.


9. Let $X$ be a 2-point space, and let $\nu_N$ be $2^{-N}$ on each one-point subset of $\Omega^{(N)}$, so
that the resulting $\nu$ on $\Omega$ is coin-tossing measure on the space of all sequences of
"heads" and "tails." Exhibit a homeomorphism of $\Omega$ onto the standard Cantor set
in $[0, 1]$ that sends $\nu$ to the usual Cantor measure, which is the Stieltjes measure
corresponding to the Cantor function that is constructed in Section VI.8 of Basic.


Problems 10–14 concern the Kolmogorov Extension Theorem (Theorem 9.5) and its
application to Brownian motion. If $J$ is a subset of the index set $I$, a subset $A$ of $\Omega$
will be said to be of type $J$ if $A$ can be described by

\[
A = x_J^{-1}(E) = \{\omega \in \Omega \mid x_J(\omega) \in E\} \quad\text{for some subset } E \subseteq S^J.
\]

As in the statement of the Kolmogorov theorem, let $\mathcal{A}'$ be the smallest algebra
containing all subsets of $\Omega$ that are measurable of type $F$ for some finite subset
$F$ of $I$. Let $\mathcal{A}$ be the smallest σ-algebra containing $\mathcal{A}'$.

10. From the fact that the collection of subsets of $\Omega$ that are of type $J$ for some countable $J$ is a σ-algebra,
prove that every set in $\mathcal{A}$ is of type $J$ for some countable set $J$.

11. Form Brownian motion for time $I = [0, T]$ by means of the Kolmogorov Extension
Theorem. Let $C$ be the subset of continuous elements $\omega$ in $\Omega$. Prove that $C$
is not in $\mathcal{A}$.

12. With $C$ as in Problem 11, prove that the only member of $\mathcal{A}$ contained in $C$ is the
empty set, and conclude that the inner measure of $C$ relative to $P$ is 0.

13. Still with $C$ as in Problem 11, suppose that $E$ is a subset of $\Omega$ of type $J$ for some
countable $J$ and that $C \subseteq E$. Prove that the set $C_J$ of elements $\omega$ in $\Omega$ that are
uniformly continuous on $J$ is contained in $E$.

14. Still with $C$ as in Problem 11, suppose for every countable subset $J$ of $I$ that the
set $C_J$ of elements $\omega$ in $\Omega$ that are uniformly continuous on $J$ is in $\mathcal{A}$ and has
$P(C_J) = 1$. Prove that the outer measure of $C$ relative to $P$ is 1.


HINTS FOR SOLUTIONS OF PROBLEMS 


Chapter I 
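1. We start from

\[
\int_0^l \sin p_n x \,\sin p_m x \,dx = -\tfrac12 \int_0^l \cos(p_n + p_m)x \,dx + \tfrac12 \int_0^l \cos(p_n - p_m)x \,dx.
\]

The first term on the right is equal to

\[
\begin{aligned}
-\frac{1}{2(p_n + p_m)} \sin(p_n + p_m)l
&= -\frac{1}{2(p_n + p_m)} \bigl(\sin p_n l \cos p_m l + \cos p_n l \sin p_m l\bigr) \\
&= -\frac{1}{2(p_n + p_m)} \Bigl(-\frac{p_n}{h} \cos p_n l \cos p_m l - \frac{p_m}{h} \cos p_n l \cos p_m l\Bigr) \\
&= +\frac{1}{2h(p_n + p_m)} (p_n + p_m) \cos p_n l \cos p_m l = \frac{1}{2h} \cos p_n l \cos p_m l.
\end{aligned}
\]

Similarly the second term on the right is $-\frac{1}{2h} \cos p_n l \cos p_m l$. The two terms cancel,
and the desired orthogonality follows.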


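2. In (a), the adjusted operator is $L(u) = ((1 - t^2)u')'$, and Green's formula gives

\[
\begin{aligned}
(\lambda_n - \lambda_m) \int_{-1}^{1} P_n(t) P_m(t)\,dt &= (L(P_n), P_m) - (P_n, L(P_m)) \\
&= \bigl[(1 - t^2)\bigl(P_n'(t)P_m(t) - P_n(t)P_m'(t)\bigr)\bigr]_{-1}^{1},
\end{aligned}
\]

where $\lambda_n$ and $\lambda_m$ are the values $\lambda_n = -n(n + 1)$ and $\lambda_m = -m(m + 1)$ such that
$L(P_n) = \lambda_n P_n$ and $L(P_m) = \lambda_m P_m$. The right side is 0 because $1 - t^2$ vanishes at
$-1$ and $1$.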

In (b), the adjusted operator is L(u) = (tu’)' + tu, and L(Jo(k - )) equals —k?t if 
Jo(k) = 0. Green’s formula gives 


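\[
\begin{aligned}
(-k_n^2 + k_m^2) \int_0^1 J_0(k_n t) J_0(k_m t)\,t\,dt
&= (L(J_0(k_n\,\cdot\,)), J_0(k_m\,\cdot\,)) - (J_0(k_n\,\cdot\,), L(J_0(k_m\,\cdot\,))) \\
&= \Bigl[t\Bigl(\tfrac{d}{dt}\bigl(J_0(k_n\,\cdot\,)\bigr)(t)\,J_0(k_m t) - J_0(k_n t)\,\tfrac{d}{dt}\bigl(J_0(k_m\,\cdot\,)\bigr)(t)\Bigr)\Bigr]_0^1.
\end{aligned}
\]

The expression in brackets on the right side is 0 at $t = 1$ because $J_0(k_n) = J_0(k_m) = 0$,
and it is 0 at $t = 0$ because of the factor $t$.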
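
The orthogonality in part (a) can be confirmed in exact arithmetic for small degrees. The sketch below assumes the standard three-term (Bonnet) recursion $(k+1)P_{k+1}(t) = (2k+1)tP_k(t) - kP_{k-1}(t)$ for the Legendre polynomials, which is not quoted in the hint, and the helper names `legendre` and `inner` are hypothetical.

```python
from fractions import Fraction


def legendre(n):
    """Coefficients (lowest degree first) of the Legendre polynomial P_n,
    built from (k+1) P_{k+1}(t) = (2k+1) t P_k(t) - k P_{k-1}(t)."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return prev
    for k in range(1, n):
        shifted = [Fraction(0)] + cur  # coefficients of t * P_k
        padded = prev + [Fraction(0)] * (len(shifted) - len(prev))
        nxt = [((2 * k + 1) * a - k * b) / (k + 1) for a, b in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur


def inner(p, q):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers integrate to 0 on [-1, 1]
                total += a * b * Fraction(2, i + j + 1)
    return total


if __name__ == "__main__":
    for n in range(5):
        print([str(inner(legendre(n), legendre(m))) for m in range(5)])
```

The printed matrix is diagonal, in line with the vanishing of the boundary term in Green's formula when $n \ne m$.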




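3. With $L(u) = (p(t)u')' - q(t)u$, the formula for $u^*(t) = \int_a^t G_0(t, s) f(s)\,ds$ in
the proof of Lemma 4.4 is

\[
u^*(t) = p(c)^{-1}\Bigl(-\varphi_1(t) \int_a^t \varphi_2(s) f(s)\,ds + \varphi_2(t) \int_a^t \varphi_1(s) f(s)\,ds\Bigr).
\]

As is observed in the proof of Lemma 4.4, the derivative of this involves terms in
which the integrals are differentiated at their upper limits, and these terms drop out.
Thus

\[
u^{*\prime}(t) = p(c)^{-1}\Bigl(-\varphi_1'(t) \int_a^t \varphi_2(s) f(s)\,ds + \varphi_2'(t) \int_a^t \varphi_1(s) f(s)\,ds\Bigr).
\]

For the second derivative, the terms do not drop out, and we obtain

\[
\begin{aligned}
u^{*\prime\prime}(t) &= p(c)^{-1}\Bigl(-\varphi_1''(t) \int_a^t \varphi_2(s) f(s)\,ds + \varphi_2''(t) \int_a^t \varphi_1(s) f(s)\,ds\Bigr) \\
&\quad + p(c)^{-1}\bigl(-\varphi_1'(t)\varphi_2(t) f(t) + \varphi_2'(t)\varphi_1(t) f(t)\bigr).
\end{aligned}
\]

When we combine these expressions to form $p(t)u^{*\prime\prime}(t) + p'(t)u^{*\prime}(t) - q(t)u^*(t)$,
the coefficient of $\int_a^t \varphi_2(s) f(s)\,ds$ is $-p(c)^{-1}L(\varphi_1) = 0$, and similarly the coefficient
of $\int_a^t \varphi_1(s) f(s)\,ds$ is $p(c)^{-1}L(\varphi_2) = 0$. Thus

\[
\begin{aligned}
L(u^*)(t) &= p(c)^{-1} p(t) f(t)\bigl(-\varphi_1'(t)\varphi_2(t) + \varphi_2'(t)\varphi_1(t)\bigr) \\
&= p(c)^{-1} p(t) f(t) \det W(\varphi_1, \varphi_2)(t) = f(t),
\end{aligned}
\]

the value of $\det W(\varphi_1, \varphi_2)$ having been computed in the proof. This completes (a).
For (b), we can take $\varphi_1(t) = \cos t$ and $\varphi_2(t) = \sin t$. Since $p(t) = 1$, we obtain

\[
G_0(t, s) = \begin{cases} \sin t \cos s - \cos t \sin s & \text{if } s \le t, \\ 0 & \text{if } s > t. \end{cases}
\]

The conditions $u(0) = 0$ and $u(\pi/2) = 0$ mean that $a = 0$, $b = \pi/2$, $c_1 = d_1 = 1$,
and $c_2 = d_2 = 0$ in (SL2). Thus the system of equations (*) in the proof of Lemma
4.4 reads

\[
\begin{pmatrix} \cos 0 & \sin 0 \\ \cos \frac{\pi}{2} & \sin \frac{\pi}{2} \end{pmatrix}
\begin{pmatrix} k_1 \\ k_2 \end{pmatrix}
= \begin{pmatrix} -u^*(0) \\ -u^*(\pi/2) \end{pmatrix},
\]

and we obtain $k_1 = -u^*(0) = 0$ and $k_2 = -u^*(\pi/2) = -\int_0^{\pi/2} f(s) \cos s\,ds$. The
proof of Lemma 4.4 says to take $K_1(s) = 0$ and $K_2(s) = -\cos s$. The formula for
$G_1(t, s)$ is $G_1(t, s) = G_0(t, s) + K_1(s)\varphi_1(t) + K_2(s)\varphi_2(t)$, and therefore

\[
G_1(t, s) = \begin{cases} \sin t \cos s - \cos t \sin s - \sin t \cos s = -\cos t \sin s & \text{if } s \le t, \\ -\sin t \cos s & \text{if } s > t. \end{cases}
\]

In particular, $G_1(t, s)$ is symmetric, as it is supposed to be!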
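
A numerical spot check of the kernel in (b) is easy to set up. The sketch below assumes, as in the hint, that the equation is $u'' + u = f$ on $[0, \pi/2]$ with $u(0) = u(\pi/2) = 0$, and it compares $\int_0^{\pi/2} G_1(t,s) f(s)\,ds$ for $f \equiv 1$ with the function $1 - \cos t - \sin t$, which one checks by hand to solve that boundary-value problem. The names `green_g1` and `solve` are hypothetical, and the midpoint rule is just one convenient quadrature.

```python
import math


def green_g1(t, s):
    """G_1(t, s) = -cos t sin s for s <= t and -sin t cos s for s > t."""
    if s <= t:
        return -math.cos(t) * math.sin(s)
    return -math.sin(t) * math.cos(s)


def solve(f, t, steps=2000):
    """Midpoint-rule approximation to the integral of G_1(t, s) f(s)
    over 0 <= s <= pi/2."""
    h = (math.pi / 2) / steps
    return h * sum(
        green_g1(t, (i + 0.5) * h) * f((i + 0.5) * h) for i in range(steps)
    )


if __name__ == "__main__":
    for t in (0.3, 0.8, 1.2):
        exact = 1 - math.cos(t) - math.sin(t)
        print(t, solve(lambda s: 1.0, t), exact)
```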
4. We have $\int_{t_1}^{t_2} \bigl((py_1')'y_2 - (py_2')'y_1\bigr)\,dt = \int_{t_1}^{t_2} (g_2 - g_1) y_1 y_2\,dt > 0$ as a result of
the outlined steps. Since $(py_1')'y_2 - (py_2')'y_1 = \frac{d}{dt}\bigl(p(y_1'y_2 - y_1y_2')\bigr)$, we conclude
that $\bigl[p(y_1'y_2 - y_1y_2')\bigr]_{t_1}^{t_2} > 0$. This proves (a).

Since $y_1(t_1) = y_1(t_2) = 0$, the expression $p(t)y_1'(t)y_2(t) - p(t)y_1(t)y_2'(t)$ is
$p(t_2)y_1'(t_2)y_2(t_2)$ at $t = t_2$. Here $p(t_2) > 0$ and $y_2(t_2) \ge 0$. Since $y_1(t_2) = 0$
and since $y_1(t) > 0$ for all $t$ slightly less than $t_2$, we obtain $y_1'(t_2) \le 0$. Thus
$p(t_2)y_1'(t_2)y_2(t_2) \le 0$. Similarly the same expression is $p(t_1)y_1'(t_1)y_2(t_1)$ at $t = t_1$.
We have $p(t_1) > 0$ and $y_2(t_1) \ge 0$. Since $y_1(t_1) = 0$ and $y_1(t) > 0$ for $t$ slightly
greater than $t_1$, we obtain $y_1'(t_1) \ge 0$. Thus $p(t_1)y_1'(t_1)y_2(t_1) \ge 0$. This gives the
desired contradiction and completes (b).

Part (c) is just the special case in which $g_1(t) = -q(t) + \lambda_1 r(t)$ and $g_2(t) =
-q(t) + \lambda_2 r(t)$. The hypothesis on $g_2 - g_1$ is satisfied because $g_2(t) - g_1(t) =
(\lambda_2 - \lambda_1) r(t) > 0$.

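5. For (a), substitute for $\Psi(x, t)$ and get $-\psi''(x)\varphi(t) + V(x)\psi(x)\varphi(t) =
i\psi(x)\varphi'(t)$. Divide by $\psi(x)\varphi(t)$ to obtain $-\frac{\psi''(x)}{\psi(x)} + V(x) = i\frac{\varphi'(t)}{\varphi(t)}$. The left side
depends only on $x$, and the right side depends only on $t$. So the two sides must be
some constant $E$. Then $-\frac{\psi''(x)}{\psi(x)} + V(x) = E$ yields $\psi'' + (E - V(x))\psi = 0$.

For (b), the equation for $\varphi$ is $i\frac{\varphi'(t)}{\varphi(t)} = E$. Then $\varphi' = -iE\varphi$, and $\varphi(t) = ce^{-iEt}$.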


6. We substitute $\psi(x) = e^{-x^2/2} H(x)$, $\psi'(x) = -xe^{-x^2/2} H(x) + e^{-x^2/2} H'(x)$,
and $\psi''(x) = x^2 e^{-x^2/2} H(x) - 2xe^{-x^2/2} H'(x) + e^{-x^2/2} H''(x) - e^{-x^2/2} H(x)$, and we
are led to Hermite's equation.

7. Write $H(x) = \sum_{k=0}^{\infty} c_k x^k$. We find that $c_0$ and $c_1$ are arbitrary and that
$(k + 2)(k + 1)c_{k+2} - (2n - 2k)c_k = 0$ for $k \ge 0$. To get a polynomial of degree $d$,
we must have $c_d \ne 0$ and $c_{d+2} = 0$. Since $c_{d+2} = c_d(2n - 2d)/((d + 2)(d + 1))$,
this happens if and only if $d = n$.

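8. We have $L(H_n(x)e^{-x^2/2}) = -(2n + 1)H_n(x)e^{-x^2/2}$. Define an inner product
by integrating over $[-N, N]$. Then

\[
\begin{aligned}
-2(n - m) \int_{-N}^{N} H_n(x) H_m(x) e^{-x^2}\,dx
&= (L(H_n e^{-x^2/2}), H_m e^{-x^2/2}) - (H_n e^{-x^2/2}, L(H_m e^{-x^2/2})) \\
&= \bigl[(H_n e^{-x^2/2})'(H_m e^{-x^2/2}) - (H_n e^{-x^2/2})(H_m e^{-x^2/2})'\bigr]_{-N}^{N}.
\end{aligned}
\]

As $N$ tends to infinity, the right side tends to 0. Since $n \ne m$, we obtain the desired
orthogonality.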


Chapter II 


1. A condition in (a) is that $f$ take on some value on a set of positive measure.
A condition in (b) is that $f$ take on only countably many values, these tending to 0,
and that the set $E$ where $f$ is nonzero be the countable union of sets $E_n$ of positive
measure such that no $E_n$ decomposes as the disjoint union of two sets of positive
measure.


2. Let $v_n$ be in $\operatorname{image}(\lambda I - L)$ with $v_n \to v$, and choose $u_n$ with $(\lambda I - L)u_n = v_n$.
We are to show that $v$ is in the image. We may assume that $v \ne 0$, so that $\|v_n\|$ is
bounded below by a positive constant for large $n$. Since $\|v_n\| \le \|\lambda I - L\|\,\|u_n\|$, $\|u_n\|$
is bounded below for large $n$. Passing to a subsequence, we may assume either that
$\|u_n\|$ tends to infinity or that $\|u_n\|$ is bounded.

If $\|u_n\|$ is bounded, then we may assume by passing to a subsequence that $\{Lu_n\}$
is convergent, say with limit $w$. From $\lambda u_n = Lu_n + v_n$, we see that $\lambda u_n \to w + v$.
Put $u = \lambda^{-1}(w + v)$. Then $(\lambda I - L)u = (w + v) - \lim Lu_n = w + v - w = v$, and
$v$ is in the image.

If $\|u_n\|$ tends to infinity, choose a subsequence such that $\{L(\|u_n\|^{-1}u_n)\}$ is
convergent, say to $w$. Then we have $\|u_n\|^{-1}\lambda u_n - L(\|u_n\|^{-1}u_n) = \|u_n\|^{-1}v_n$. Passing
to the limit and using that $v_n \to v$, we see that $\|u_n\|^{-1}\lambda u_n \to w$. Applying $L$, we
obtain $\lambda w = L(w)$. Thus $(\lambda I - L)w = 0$. Since $\lambda I - L$ is one-one, $w = 0$. Then
$\|u_n\|^{-1}\lambda u_n \to 0$, and we obtain a contradiction since $\|u_n\|^{-1}\lambda u_n$ has norm $|\lambda|$ for all
$n$.

3. It was shown in Section 4 that the set of Hilbert–Schmidt operators is a normed
linear space with norm $\|\cdot\|_{\mathrm{HS}}$. Since $\|L\| \le \|L\|_{\mathrm{HS}}$, any Cauchy sequence $\{L_n\}$ in
this space is Cauchy in the operator norm. The completeness of the space of bounded
linear operators in the operator norm shows that $\{L_n\}$ converges to some $L$ in the
operator norm. In particular, $\lim_n (L_n u, v) = (Lu, v)$ for all $u$ and $v$. By Fatou's
Lemma,

\[
\|L\|_{\mathrm{HS}}^2 = \sum_j \|Lu_j\|^2 = \sum_j \liminf_n \|L_n u_j\|^2
\le \liminf_n \sum_j \|L_n u_j\|^2 = \liminf_n \|L_n\|_{\mathrm{HS}}^2.
\]

The right side is finite since Cauchy sequences are bounded, and hence $L$ is a Hilbert–Schmidt
operator. A second application of Fatou's Lemma gives

\[
\begin{aligned}
\|L_m - L\|_{\mathrm{HS}}^2 &= \sum_j \|(L_m - L)u_j\|^2 = \sum_j \liminf_n \|(L_m - L_n)u_j\|^2 \\
&\le \liminf_n \sum_j \|(L_m - L_n)u_j\|^2 = \liminf_n \|L_m - L_n\|_{\mathrm{HS}}^2.
\end{aligned}
\]

Since the given sequence is Cauchy, the lim sup on $m$ of the right side is 0, and hence
$\{L_m\}$ converges to $L$ in the Hilbert–Schmidt norm.

4. If $L$ and $M$ are of trace class, then $\sum_i |((L + M)u_i, v_i)| \le \sum_i \bigl(|(Lu_i, v_i)| +
|(Mu_i, v_i)|\bigr) \le \|L\|_{\mathrm{TC}} + \|M\|_{\mathrm{TC}}$. Taking the supremum over all orthonormal bases
$\{u_i\}$ and $\{v_i\}$, we obtain the triangle inequality.

5. Once we know that $\operatorname{Tr}(AL) = \operatorname{Tr}(LA)$, then $\operatorname{Tr}(BLB^{-1}) = \operatorname{Tr}(B^{-1}(BL)) =
\operatorname{Tr}(L)$. To prove that $\operatorname{Tr}(AL) = \operatorname{Tr}(LA)$, fix an orthonormal basis $\{u_i\}$. The formal
computation is

\[
\begin{aligned}
\operatorname{Tr}(AL) &= \sum_i (ALu_i, u_i) = \sum_i (Lu_i, A^*u_i) = \sum_i \sum_j (Lu_i, u_j)\overline{(A^*u_i, u_j)} \\
&= \sum_i \sum_j (Au_j, u_i)\overline{(L^*u_j, u_i)} = \sum_j \sum_i (Au_j, u_i)\overline{(L^*u_j, u_i)} \\
&= \sum_j (Au_j, L^*u_j) = \sum_j (LAu_j, u_j) = \operatorname{Tr}(LA),
\end{aligned}
\]

and justification is needed for the interchange of order of summation within the second
line. It is enough to have absolute convergence in some orthonormal basis, and this
will be derived from the estimate

\[
\begin{aligned}
\sum_{i,j} |(Au_j, u_i)(L^*u_j, u_i)| &\le \sum_j \Bigl(\sum_i |(Au_j, u_i)|^2\Bigr)^{1/2} \Bigl(\sum_i |(L^*u_j, u_i)|^2\Bigr)^{1/2} \\
&= \sum_j \|Au_j\|\,\|L^*u_j\| \le \|A\| \sum_j \|L^*u_j\|.
\end{aligned}
\]

The proof of Proposition 2.8, applied to $L^*$ instead of $L$, produces operators $U$ and $T$,
orthonormal bases $\{w_i\}$ and $\{f_i\}$, and scalars $\lambda_i \ge 0$ such that $L^* = UT$, $\|U\| \le 1$,
$Tw_i = \lambda_i w_i$, and $\sum_i |(L^*w_i, f_i)| = \sum_i (Tw_i, w_i)$. Taking $u_i = w_i$, we have
$\|L^*w_i\| = \|UTw_i\| \le \|Tw_i\| = \lambda_i = (Tw_i, w_i)$. Hence for this orthonormal
basis, $\sum_i \|L^*w_i\| \le \sum_i (Tw_i, w_i) = \sum_i |(L^*w_i, f_i)|$. The right side is finite since $L^*$
is of trace class.


6. If $v$ is a nonzero vector in the $\lambda$ eigenspace of $L_\alpha$ and if $L_\beta L_\alpha = L_\alpha L_\beta$, then
$L_\alpha L_\beta(v) = L_\beta L_\alpha(v) = \lambda L_\beta v$. Thus the $\lambda$ eigenspace of $L_\alpha$ is invariant under $L_\beta$.
We apply Theorem 2.3 to the compact operator $L_\beta$ on each eigenspace of $L_\alpha$, obtaining
an orthonormal basis of simultaneous eigenvectors under $L_\alpha$ and $L_\beta$. Iterating this
procedure by taking into account one new operator at a time, we obtain the desired
basis.


7. In (a), the operators L + L* and —i(L — L*) are self adjoint, and they commute 
since L commutes with L*. Compactness is preserved under passage to adjoints and 
under taking linear combinations, and (b) follows. 


8. If $U$ is unitary, then $U^* = U^{-1}$. Then $UU^{-1} = I = U^{-1}U$ shows that $U$
is normal. Since $U$ preserves norms, every eigenvalue $\lambda$ has $|\lambda| = 1$. If $U$ is also
compact, then the eigenvalues tend to 0. Hence $U$ is compact if and only if the Hilbert
space is finite-dimensional.


9. The solutions of the homogeneous equation are spanned by $\cos \omega t$ and $\sin \omega t$.
Then the result follows by applying variation of parameters. 


10. Take g(s) = p(s)u(s) in Problem 9. 
11. In (a), let $t < t'$. Then

\[
\begin{aligned}
(Tf)(t') - (Tf)(t) &= \int_a^{t'} K(t', s) f(s)\,ds - \int_a^{t} K(t, s) f(s)\,ds \\
&= \int_t^{t'} K(t', s) f(s)\,ds + \int_a^{t} [K(t', s) - K(t, s)] f(s)\,ds.
\end{aligned}
\]

The first term on the right tends to 0 as $t' - t$ tends to 0 because the integrand is
bounded, and the second term tends to 0 by the boundedness of $f$ and the uniform
continuity of $K(t', s) - K(t, s)$ on the set of $(s, t, t')$ where $a \le s \le t \le t'$.

In (b), for $n = 1$, we have $|(Tf)(t)| = \bigl|\int_a^t K(t, s) f(s)\,ds\bigr| \le M \int_a^t |f(s)|\,ds
\le CM$, as required. Assume the result for $n - 1 \ge 1$, namely that $|(T^{n-1}f)(t)|
\le \frac{1}{(n-2)!} CM^{n-1}(t - a)^{n-2}$. Then

\[
\begin{aligned}
|(T^n f)(t)| &= \Bigl|\int_a^t K(t, s)(T^{n-1}f)(s)\,ds\Bigr| \le M \int_a^t |(T^{n-1}f)(s)|\,ds \\
&\le M \frac{1}{(n-2)!} CM^{n-1} \int_a^t (s - a)^{n-2}\,ds = \frac{1}{(n-1)!} CM^{n}(t - a)^{n-1} \\
&\le \frac{1}{(n-1)!} CM^{n}(b - a)^{n-1}.
\end{aligned}
\]

Thus the $n^{\text{th}}$ term of the series is $\le \frac{1}{(n-1)!} CM^{n}(b - a)^{n-1}$.

In (c), the uniform convergence follows from the estimate in (b) and the Weierstrass
M test.


12. The operator $T$ is bounded as a linear operator from $C([a, b])$ into itself.
Because of the uniform convergence, we can apply the operator term by term to the
series defining $u$. The result is $Tu = Tf + T^2f + T^3f + \cdots = u - f$. Therefore
$u - Tu = f$.


13. Subtracting, we are to investigate solutions of $u - Tu = 0$. Problem 11
showed for each continuous $u$ that the series $u + Tu + T^2u + \cdots$ is uniformly
convergent. If $u = Tu$, then all the terms in this series equal $u$, and the only way that
the series can converge uniformly is if $u = 0$.
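
The series $f + Tf + T^2f + \cdots$ of Problems 11–13 can be summed numerically on a grid. The sketch below is an illustration under stated assumptions rather than anything from the text: it discretizes $(Tf)(t) = \int_a^t K(t,s)f(s)\,ds$ with the trapezoid rule, and it uses the test case $K \equiv 1$, $f \equiv 1$ on $[0, 1]$, for which $T^nf(t) = t^n/n!$ and the series sums to $e^t$. The function names are hypothetical.

```python
import math


def volterra_apply(kernel, values, a, b):
    """Trapezoid-rule approximation of (Tf)(t) = integral_a^t K(t, s) f(s) ds
    at the grid points; `values` holds f on a uniform grid over [a, b]."""
    n = len(values) - 1
    h = (b - a) / n
    out = [0.0]
    for i in range(1, n + 1):
        t = a + i * h
        terms = [kernel(t, a + j * h) * values[j] for j in range(i + 1)]
        out.append(h * (sum(terms) - 0.5 * (terms[0] + terms[-1])))
    return out


def neumann_solve(kernel, f_values, a, b, terms=30):
    """Partial sum f + Tf + ... + T^terms f of the series, on the grid."""
    u = list(f_values)
    term = list(f_values)
    for _ in range(terms):
        term = volterra_apply(kernel, term, a, b)
        u = [x + y for x, y in zip(u, term)]
    return u


if __name__ == "__main__":
    n = 200
    f_values = [1.0] * (n + 1)
    u = neumann_solve(lambda t, s: 1.0, f_values, 0.0, 1.0)
    print(u[-1], math.e)  # u(1) should be close to e
```

The factorial decay of the terms, as in the estimate of Problem 11b, is what makes a modest number of terms sufficient here.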
Chapter III


1. Let $D_j = \partial/\partial y_j$. Let $\widetilde{\mathcal{S}}$ be the vector space of all linear combinations of
functions $(1 + 4\pi^2|y|^2)^{-n}h$ with $n$ a positive integer and $h$ in the Schwartz space $\mathcal{S}$.
Then $D_j\bigl((1 + 4\pi^2|y|^2)^{-n}h\bigr) = -8n\pi^2 y_j (1 + 4\pi^2|y|^2)^{-(n+1)}h + (1 + 4\pi^2|y|^2)^{-n}D_jh$.
The first term on the right side is in $\widetilde{\mathcal{S}}$ because $y_jh$ is in $\mathcal{S}$, and the second term on the
right side is in $\widetilde{\mathcal{S}}$ because $D_jh$ is in $\mathcal{S}$. Thus $\widetilde{\mathcal{S}}$ is closed under all partial derivatives.
Since the product of a polynomial and a Schwartz function is a Schwartz function, $\widetilde{\mathcal{S}}$
is closed under multiplication by polynomials. Since the members of $\widetilde{\mathcal{S}}$ are bounded,
we must have $\widetilde{\mathcal{S}} \subseteq \mathcal{S}$. In particular, $(1 + 4\pi^2|y|^2)^{-1}g$ is in $\mathcal{S}$ if $g$ is in $\mathcal{S}$.


2. Since the Fourier transform and its inverse are continuous, it is enough to handle 
pointwise product. Pointwise product is handled directly. 


3. In (a), the ordinary partial derivatives are $D_x\bigl(\log((x^2 + y^2)^{-1})\bigr) = \frac{-2x}{x^2 + y^2}$ and
$D_y\bigl(\log((x^2 + y^2)^{-1})\bigr) = \frac{-2y}{x^2 + y^2}$. These are also weak derivatives. In fact, use of
polar coordinates shows that they are integrable near $(0, 0)$, hence locally integrable
on $\mathbb{R}^2$. If $\varphi$ is in $C^\infty_{\mathrm{com}}(\Omega)$, we are to show that $\int_\Omega \log((x^2 + y^2)^{-1}) D_x\varphi(x, y)\,dx\,dy =
-\int_\Omega \frac{-2x}{x^2 + y^2}\,\varphi(x, y)\,dx\,dy$ and similarly for $y$. For each $y \ne 0$, the integrals over $x$ are equal,
and the set where $y = 0$ is of measure 0 in $\Omega$. The argument with the variables
interchanged is similar. Thus $\log((x^2 + y^2)^{-1})$ has weak derivatives of order 1.

In polar coordinates the $p^{\text{th}}$ power of $\bigl|\frac{-2x}{x^2 + y^2}\bigr|$ is $\frac{2^p r^p |\cos\theta|^p}{r^{2p}} = 2^p r^{-p}|\cos\theta|^p$, which is
integrable near $r = 0$ relative to $r\,dr$ for $p < 2$ but not $p = 2$.

In (b), the argument for the existence of the weak derivative of $\log\log((x^2 + y^2)^{-1})$
is similar to the argument for (a), the ordinary $x$ derivative being

\[
\frac{-2x}{(x^2 + y^2)\log((x^2 + y^2)^{-1})}.
\]

In polar coordinates the square of this is $\frac{4\cos^2\theta}{r^2 \log^2(r^{-2})}$, which is integrable relative to
$r\,dr$.


4. The idea is to use the Implicit Function Theorem to obtain, for each point of the
boundary, a neighborhood of the point for which some coordinate has the property that
the cone of a particular size and orientation based at any point in that neighborhood
lies in the region. These neighborhoods cover the boundary, and we extract a finite
subcover. Then we obtain a single size of cone such that every point of the boundary
has some coordinate where the cone lies in $\Omega$. The cones based at the boundary
points cover all points within some distance $\epsilon > 0$ of the boundary, and cones of half
the height based at interior points within those cones and within distance $\epsilon/2$ of the
boundary lie within the cones for the boundary points. The remaining points of the
region can then be covered by a cone with any orientation such that its vertex is at
distance $< \epsilon/2$ from all its other points.


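5. For $0 < \alpha < N$, $|x|^{-(N-\alpha)}$ is the sum of an $L^1$ function and an $L^\infty$ function and
hence is a tempered distribution. It is the sum of an $L^1$ function and an $L^2$ function
for $0 < \alpha < N/2$.

6. The second expression is converted into the first by changing $t$ into $1/t$. The
first expression is evaluated as the third by replacing $t|x|^2$ by $s$.


7. The formula obtained from the first displayed identity is

\[
\int_{\mathbb{R}^N} (\pi|x|^2)^{-\frac12(N-\alpha)}\,\Gamma\bigl(\tfrac12(N - \alpha)\bigr)\,\widehat{\varphi}(x)\,dx = \int_{\mathbb{R}^N} (\pi|x|^2)^{-\frac12\alpha}\,\Gamma\bigl(\tfrac12\alpha\bigr)\,\varphi(x)\,dx,
\]

which sorts out as

\[
\pi^{-\frac12(N-\alpha)}\,\Gamma\bigl(\tfrac12(N - \alpha)\bigr) \int_{\mathbb{R}^N} |x|^{-(N-\alpha)}\,\widehat{\varphi}(x)\,dx = \pi^{-\frac12\alpha}\,\Gamma\bigl(\tfrac12\alpha\bigr) \int_{\mathbb{R}^N} |x|^{-\alpha}\,\varphi(x)\,dx.
\]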


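8. In (a), we check directly that $\mathcal{F}(D^\alpha T) = (2\pi i)^{|\alpha|}\xi^\alpha \mathcal{F}(T)$. Since $T$ is in $H^s$,
$\int_{\mathbb{R}^N} |\mathcal{F}(T)(\xi)|^2 (1 + |\xi|^2)^s\,d\xi$ is finite. Now $|\xi_j| \le |\xi| \le (1 + |\xi|^2)^{1/2}$ for every $j$,
and hence $|\xi^\alpha| \le (1 + |\xi|^2)^{s/2}$ for $|\alpha| = s$. Since $(1 + |\xi|^2)^{1/2} \ge 1$, $(1 + |\xi|^2)^{t}$ is
an increasing function of $t$, and thus $|\xi^\alpha| \le (1 + |\xi|^2)^{s/2}$ for $|\alpha| \le s$. Consequently
$(2\pi i)^{|\alpha|}\xi^\alpha \mathcal{F}(T)$ is square integrable for $|\alpha| \le s$. Thus the Fourier transform of $D^\alpha T$
is a square integrable function for $|\alpha| \le s$. By the Plancherel formula, $D^\alpha T$ is a
square integrable function for $|\alpha| \le s$.

Let $T$ be the $L^2$ function $f$, and let $D^\alpha T$ be the $L^2$ function $g_\alpha$ for $|\alpha| \le s$.
The statement that $f$ has $g_\alpha$ as weak derivative of order $\alpha$ is the statement that
$\int_{\mathbb{R}^N} f D^\alpha\varphi\,dx = (-1)^{|\alpha|} \int_{\mathbb{R}^N} g_\alpha \varphi\,dx$ for $\varphi \in C^\infty_{\mathrm{com}}(\mathbb{R}^N)$; this is proved for $\varphi = \overline{\psi}$
by the following computation, which uses the polarized version of the Plancherel
formula twice:

\[
\begin{aligned}
(-1)^{|\alpha|} \int_{\mathbb{R}^N} g_\alpha \overline{\psi}\,dx &= (-1)^{|\alpha|} \int_{\mathbb{R}^N} (2\pi i)^{|\alpha|}\xi^\alpha \mathcal{F}(f)(\xi)\,\overline{\mathcal{F}(\psi)(\xi)}\,d\xi \\
&= \int_{\mathbb{R}^N} \mathcal{F}(f)(\xi)\,\overline{(2\pi i)^{|\alpha|}\xi^\alpha \mathcal{F}(\psi)(\xi)}\,d\xi = \int_{\mathbb{R}^N} \mathcal{F}(f)\,\overline{\mathcal{F}(D^\alpha\psi)}\,d\xi = \int_{\mathbb{R}^N} f\,\overline{D^\alpha\psi}\,dx.
\end{aligned}
\]

Since $f$ and its weak derivatives $g_\alpha$ through $|\alpha| \le s$ are all in $L^2$, $f$ is in $L^2_s(\mathbb{R}^N)$.

In (b), if $T$ is given by an $L^2$ function $f$, then $\mathcal{F}(T) = \mathcal{F}(f)$ is an $L^2$ function.
Hence $\mathcal{F}(T)$ is locally square integrable. We are assuming that $D^\alpha T$ is given by an
$L^2$ function $g_\alpha$ for $|\alpha| \le s$. The formula $\mathcal{F}(g_\alpha) = \mathcal{F}(D^\alpha T) = (2\pi i)^{|\alpha|}\xi^\alpha \mathcal{F}(T)$ shows
that $\xi^\alpha \mathcal{F}(f)$ is in $L^2$ for $|\alpha| \le s$. Now $|\xi|^2 |\mathcal{F}(f)|^2 = \sum_j |\xi_j \mathcal{F}(f)|^2$ and similarly
$|\xi|^{2k} |\mathcal{F}(f)|^2 = \sum_{|\alpha| = k} \frac{k!}{\alpha_1! \cdots \alpha_N!} |\xi^\alpha \mathcal{F}(f)|^2$. Hence

\[
(1 + |\xi|^2)^s |\mathcal{F}(f)|^2 = \sum_{k=0}^{s} \binom{s}{k} |\xi|^{2k} |\mathcal{F}(f)|^2 \le s! \sum_{|\alpha| \le s} |\xi^\alpha \mathcal{F}(f)|^2,
\]

and $f$ is in $H^s$.

For (c), in one direction the argument for (a) gives

\[
\begin{aligned}
\|f\|_{L^2_s}^2 &= \sum_{|\alpha| \le s} \|D^\alpha f\|_2^2 = \sum_{|\alpha| \le s} \|(2\pi i)^{|\alpha|}\xi^\alpha \mathcal{F}(f)\|_2^2 \\
&\le \Bigl(\sum_{|\alpha| \le s} (2\pi)^{2|\alpha|}\Bigr) \|(1 + |\xi|^2)^{s/2} \mathcal{F}(f)\|_2^2 = \Bigl(\sum_{|\alpha| \le s} (2\pi)^{2|\alpha|}\Bigr) \|f\|_{H^s}^2.
\end{aligned}
\]

In the other direction the displayed formula for (b), when integrated, gives
$\|f\|_{H^s}^2 \le s! \sum_{|\alpha| \le s} (2\pi)^{-2|\alpha|} \|D^\alpha f\|_2^2 \le s!\,\|f\|_{L^2_s}^2$.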
9. In (a), let $T$ be in $H^s$. Then the computation

\[
\|T\|_{H^s} = \|(1 + |\xi|^2)^{s/2} \mathcal{F}(T)\|_2 = \|\mathcal{F}^{-1}\bigl((1 + |\xi|^2)^{s/2} \mathcal{F}(T)\bigr)\|_2 = \|A_s(T)\|_2
\]

shows that $A_s$ preserves norms. To see that $A_s$ is onto $L^2$, let $f$ be in $L^2$. Then
$\mathcal{F}(f)$ is in $L^2$ and hence acts as a tempered distribution. Then $(1 + |\xi|^2)^{-s/2}\mathcal{F}(f)$
is a tempered distribution also. Since $\mathcal{F}$ carries $\mathcal{S}'(\mathbb{R}^N)$ onto itself, $T =
\mathcal{F}^{-1}\bigl((1 + |\xi|^2)^{-s/2}\mathcal{F}(f)\bigr)$ is a tempered distribution. This tempered distribution
has the property that $A_s(T) = f$.

In (b), the relevant formula is that $(A_s)^{-1}(\varphi) = \mathcal{F}^{-1}\bigl((1 + |\xi|^2)^{-s/2} \mathcal{F}(\varphi)\bigr)$. If $\varphi$ is
in $\mathcal{S}(\mathbb{R}^N)$, then so is $\mathcal{F}(\varphi)$. An easy induction shows that any iterated derivative of
$(1 + |\xi|^2)^{-s/2}$ is a sum of products of polynomials in $\xi$ times powers (possibly negative)
of $1 + |\xi|^2$. Application of the Leibniz rule therefore shows that any iterated derivative
of $(1 + |\xi|^2)^{-s/2}\mathcal{F}(\varphi)$ is a sum of products of polynomials in $\xi$ times derivatives of
$\mathcal{F}(\varphi)$, all divided by powers of $1 + |\xi|^2$. Consequently $(1 + |\xi|^2)^{-s/2}\mathcal{F}(\varphi)$ is a
Schwartz function, and so is its inverse Fourier transform.

For (c), we know that $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ is dense in $L^2(\mathbb{R}^N)$, and hence $\mathcal{S}(\mathbb{R}^N)$ is dense
in $L^2(\mathbb{R}^N)$ also. Applying the operator $(A_s)^{-1}$, which must carry $\mathcal{S}(\mathbb{R}^N)$ onto itself,
we see that $\mathcal{S}(\mathbb{R}^N)$ is dense in $H^s$.
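10. If $T$ is in $H^{-s}$ and $\varphi$ is in $\mathcal{S}(\mathbb{R}^N)$, then the definition of Fourier transform on
$\mathcal{S}'(\mathbb{R}^N)$, together with the Schwarz inequality, implies that

\[
\begin{aligned}
|\langle T, \varphi\rangle| &= |\langle \mathcal{F}(T), \mathcal{F}^{-1}(\varphi)\rangle| = \Bigl|\int_{\mathbb{R}^N} \mathcal{F}(T)(\xi)\,\mathcal{F}^{-1}(\varphi)(\xi)\,d\xi\Bigr| \\
&= \Bigl|\int_{\mathbb{R}^N} \bigl[(1 + |\xi|^2)^{-s/2} \mathcal{F}(T)(\xi)\bigr]\bigl[(1 + |\xi|^2)^{s/2} \mathcal{F}^{-1}(\varphi)(\xi)\bigr]\,d\xi\Bigr| \\
&\le \|(1 + |\xi|^2)^{-s/2} \mathcal{F}(T)\|_2\,\|(1 + |\xi|^2)^{s/2} \mathcal{F}^{-1}(\varphi)\|_2 = \|T\|_{H^{-s}}\,\|\varphi\|_{H^s}.
\end{aligned}
\]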

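11. For $\psi$ in $\mathcal{S}(\mathbb{R}^N)$, we have

\[
\begin{aligned}
|\langle \mathcal{F}(T), \psi\rangle| = |\langle T, \mathcal{F}(\psi)\rangle| &\le C\|\mathcal{F}(\psi)\|_{H^s} = C\Bigl(\int_{\mathbb{R}^N} |\mathcal{F}(\mathcal{F}(\psi))(\xi)|^2 (1 + |\xi|^2)^s\,d\xi\Bigr)^{1/2} \\
&= C\Bigl(\int_{\mathbb{R}^N} |\psi(\xi)|^2 (1 + |\xi|^2)^s\,d\xi\Bigr)^{1/2} = C\|\psi\|_{L^2(\mathbb{R}^N,\,(1+|\xi|^2)^s\,d\xi)}.
\end{aligned}
\]

Thus $\mathcal{F}(T)$ acts as a bounded linear functional on the dense
vector subspace $\mathcal{S}(\mathbb{R}^N)$ of $L^2(\mathbb{R}^N, (1 + |\xi|^2)^s\,d\xi)$. Extending this linear functional
continuously to the whole space and applying the Riesz Representation Theorem for
Hilbert spaces, we obtain a function $f$ in $L^2(\mathbb{R}^N, (1 + |\xi|^2)^s\,d\xi)$ such that

\[
\langle \mathcal{F}(T), \psi\rangle = \int_{\mathbb{R}^N} \psi(\xi)\,\overline{f(\xi)}\,(1 + |\xi|^2)^s\,d\xi
\]

for all $\psi$ in $\mathcal{S}(\mathbb{R}^N)$. Put $\psi_0(\xi) = \overline{f(\xi)}(1 + |\xi|^2)^s$. Then $\int_{\mathbb{R}^N} |\psi_0(\xi)|^2 (1 + |\xi|^2)^{-s}\,d\xi =
\int_{\mathbb{R}^N} |f(\xi)|^2 (1 + |\xi|^2)^s\,d\xi < \infty$, and the above displayed formula shows that $\mathcal{F}(T)$
agrees with the function $\psi_0$ on $\mathcal{S}(\mathbb{R}^N)$. Thus $T$ is in $H^{-s}$. To estimate $\|T\|_{H^{-s}}$,
we twice use the fact that $\mathcal{S}(\mathbb{R}^N)$ is dense:

\[
\|T\|_{H^{-s}} = \|\psi_0\|_{L^2(\mathbb{R}^N,\,(1+|\xi|^2)^{-s}\,d\xi)} = \|f\|_{L^2(\mathbb{R}^N,\,(1+|\xi|^2)^{s}\,d\xi)}
= \sup_{\|\psi\|_{L^2(\mathbb{R}^N,\,(1+|\xi|^2)^{s}\,d\xi)} \le 1} |\langle \mathcal{F}(T), \psi\rangle| = \sup_{\|\varphi\|_{H^s} \le 1} |\langle T, \varphi\rangle|.
\]

Thus $\|T\|_{H^{-s}} \le C$.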
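12. In (a), we apply the Schwarz inequality:

\[
\|\varphi\|_{\sup} \le \|\mathcal{F}(\varphi)\|_1 = \int_{\mathbb{R}^N} |\mathcal{F}(\varphi)(\xi)|\,\bigl[(1 + |\xi|^2)^{s/2}\bigr]\bigl[(1 + |\xi|^2)^{-s/2}\bigr]\,d\xi \le \|T_\varphi\|_{H^s} \Bigl(\int_{\mathbb{R}^N} (1 + |\xi|^2)^{-s}\,d\xi\Bigr)^{1/2}.
\]

For (b), the last integral in (a) is finite for $s > N/2$. Thus we have $\|\varphi\|_{\sup} \le
C\|T_\varphi\|_{H^s}$ for all $\varphi$ in $\mathcal{S}(\mathbb{R}^N)$. If $T$ is in $H^s$, we know from Problem 9c that we
can find a sequence $\varphi_k$ in $\mathcal{S}(\mathbb{R}^N)$ such that $T_{\varphi_k}$ tends to $T$ in $H^s$. For $p \le q$, we
then have $\|\varphi_p - \varphi_q\|_{\sup} \le C\|T_{\varphi_p} - T_{\varphi_q}\|_{H^s}$. Letting $q$ tend to infinity, we see that
$\varphi_p$ converges uniformly to some function $f$, necessarily continuous and bounded.
Let $T_f$ be the tempered distribution given by $f$. We show that $T = T_f$. If $\psi$
is in $\mathcal{S}(\mathbb{R}^N)$, then $\mathcal{F}(\psi)$ is integrable, being a Schwartz function, and the uniform
convergence of $\varphi_p$ to $f$ implies that $\langle T_f, \mathcal{F}(\psi)\rangle = \lim_p \langle T_{\varphi_p}, \mathcal{F}(\psi)\rangle$. On the other
hand, $|\langle T_{\varphi_p} - T, \mathcal{F}(\psi)\rangle| \le \|T_{\varphi_p} - T\|_{H^s}\,\|\mathcal{F}(\psi)\|_{H^{-s}}$, and thus $\langle T_{\varphi_p}, \mathcal{F}(\psi)\rangle$ tends to
$\langle T, \mathcal{F}(\psi)\rangle$. Therefore $\langle T_f, \mathcal{F}(\psi)\rangle = \langle T, \mathcal{F}(\psi)\rangle$, and $T = T_f$.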
13. In (a), $P_y * (u_0 + iHu_0)(x) = P_y * u_0(x) + iQ_y * u_0(x) = \frac{i\bar{z}}{\pi|z|^2} * u_0(x) =
((-i\pi z)^{-1}) * u_0(x)$. The left side is in $\mathcal{H}^2$ since $H$ is bounded on $L^2$, and the form
of the right side shows that the result is analytic in the upper half plane. Hence the
expression is in $H^2$.

In (b), we know that $f(x + iy) = P_y * u_0(x) + iQ_y * u_0(x) = P_y * u_0(x) +
iP_y * Hu_0(x)$. Taking the $L^2$ limit as $y \downarrow 0$, we obtain $f_0 = u_0 + iHu_0$. Hence $Hu_0$
is the imaginary part of $f_0$.

14. According to the previous problem, the functions in $H^2$ are those of the form
$P_y * (u_0 + iHu_0)$ with $u_0$ in $L^2$. That is, their boundary functions are the functions of the form $u_0 + iHu_0$
with $u_0$ in $L^2$. The operator $H$ acts on the Fourier transform side by multiplication by
$-i\,\operatorname{sgn} x$. Hence the Fourier transforms of the functions of interest are all expressions
$\widehat{u}_0(x) + i(-i\,\operatorname{sgn} x)\widehat{u}_0(x)$ a.e. This function is $2\widehat{u}_0(x)$ for $x > 0$ and is 0 for $x < 0$.
Conversely any function in $L^2$ is the Fourier transform of an $L^2$ function, and thus
if $g$ is given that vanishes a.e. for $x < 0$, we can find $u_0$ with $\widehat{u}_0 = \frac12 g$. Then
$\widehat{u}_0 + i(-i\,\operatorname{sgn} x)\widehat{u}_0 = g$.
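15. The first inequality is by the Schwarz inequality, and the second inequality is
evident. For the equality we make the calculation

\[
\begin{aligned}
\Delta(|F|^q) &= 4\frac{\partial}{\partial z}\frac{\partial}{\partial \bar{z}}(|F|^2)^{q/2} = 2q\,\frac{\partial}{\partial z}\Bigl[(|F|^2)^{\frac{q}{2}-1}\,\frac{\partial}{\partial \bar{z}}(F, F)\Bigr] \\
&= 2q\,\frac{\partial}{\partial z}\bigl[(|F|^2)^{\frac{q}{2}-1}(F, F')\bigr] \\
&= q(q - 2)(|F|^2)^{\frac{q}{2}-2}(F', F)(F, F') + 2q(|F|^2)^{\frac{q}{2}-1}(F', F') \\
&= q^2|F|^{q-4}|(F, F')|^2 - 2q|F|^{q-4}|(F, F')|^2 + 2q|F|^{q-2}|F'|^2 \\
&= q^2|F|^{q-4}|(F, F')|^2 + 2q|F|^{q-4}\bigl(-|(F, F')|^2 + |F|^2|F'|^2\bigr).
\end{aligned}
\]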

16. Arguing by contradiction, suppose that $u(x_1) > 0$ with $|x_1 - x_0| < r$. For
any $c > 0$, the function $v_c(x) = u(x) + c(|x - x_0|^2 - r^2)$ has $\Delta v_c > 0$ on $B(r; x_0)$
and $v_c = u \le 0$ on $\partial B(r; x_0)$. We can choose the positive number $c$ sufficiently small
so that $v_c(x_1) > 0$. Fix that $c$, and choose $x_2$ in $B(r; x_0)^{\mathrm{cl}}$ where $v_c$ is a maximum.
Then $x_2$ is in $B(r; x_0)$, and all the first partial derivatives of $v_c$ must be 0 there. Since
$\Delta v_c(x_2) > 0$, we must have $D_j^2 v_c(x_2) > 0$ for some $j$, and then the presence of a
maximum for $v_c$ at $x_2$ contradicts the second derivative test.

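17. For (a), we calculate $\|g_\epsilon\|_2^2 = \int_{\mathbb{R}} |g_\epsilon(x)|^2\,dx = \int_{\mathbb{R}} |F_\epsilon(x)|\,dx \le
\int_{\mathbb{R}} |f(x + i\epsilon)|\,dx + \epsilon \int_{\mathbb{R}} |x + i|^{-2}\,dx \le \|f\|_{H^1} + \epsilon\|(x + i)^{-2}\|_1$.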

In (b), the functions $x \mapsto g_\epsilon(x + iy)$ and $x \mapsto F_\epsilon(x + iy)$ are Poisson integrals
of the functions with $y$ replaced by $y/2$, and then are iterated Poisson integrals in
passing from $y/2$ to $3y/4$ and to $y$. In the first case the starting function is in
$L^2$, and in the second case the starting function is in $L^1$. The function at $3y/4$ is
then in $L^2$ since $L^1 * L^2 \subseteq L^2$, and the function at $y$ is continuous vanishing at
infinity since $L^2 * L^2 \subseteq C_0(\mathbb{R})$. This handles the dependence for large $x$. For
large $y$, we refer to the proof of Theorem 3.25, where we obtained the estimate
u(x,t)? < Lt) N*1 Qi + Leollull?,, if wis in H? and t > to. 

In (c), the functions $|F_\epsilon(z)|^{1/2}$ and $g_\epsilon(z)$ are equal for $z = x$. Hence the continuous
function $u(z) = |F_\epsilon(z)|^{1/2} - g_\epsilon(z)$ on the closed upper half plane vanishes at $y = 0$ and tends to 0 as $|x| + |y|$
tends to infinity. Given $\delta > 0$, choose an open ball $B$ large enough in $\mathbb{R}^2$ so that
$u(z) \le \delta$ off this ball. Since the second component of $F_\epsilon(z)$ is nowhere vanishing,
$|F_\epsilon(z)|^{1/2}$ is everywhere smooth for $y > 0$. Problem 15 shows that $\Delta(|F_\epsilon(z)|^{1/2}) \ge 0$,
and we know that $\Delta g_\epsilon(z) = 0$ since $g_\epsilon$ is a Poisson integral. Hence $\Delta u(z) \ge 0$.
Applying Problem 16 on the ball $B$, we see that $u(z) \le \delta$ on $B$. Hence $u(z) \le \delta$ on
$\mathbb{R}^2_+$. Since $\delta$ is arbitrary, $u(z) \le 0$ on $\mathbb{R}^2_+$. Therefore $|F_\epsilon(z)|^{1/2} \le g_\epsilon(z)$ on $\mathbb{R}^2_+$.


18. In (a), the fact that $P_y$ is in $L^2$ implies that $\lim_n \int_{\mathbb{R}} P_y(x - t) g_{\epsilon_n}(t)\,dt =
\int_{\mathbb{R}} P_y(x - t) g(t)\,dt$. Thus $g_{\epsilon_n}(z) \to g(z)$ pointwise for $\operatorname{Im} z > 0$. Then we have
$|f(z)|^{1/2} \le \limsup_n |f(z + i\epsilon_n)|^{1/2} \le \limsup_n g_{\epsilon_n}(z) = g(z)$. Since $g(z)$ is the
Poisson integral of $g(x)$, the inequality $g(x + iy) \le Cg^*(x)$ is known from the given
facts at the beginning of this group of problems.

In (b), we have $|f(x + iy)| \le C^2 g^*(x)^2$, and we know that $\|g^*\|_2 \le A_2\|g\|_2$. From
Problem 17a we have $\|g\|_2^2 \le \limsup_n \|g_{\epsilon_n}\|_2^2 \le \limsup_n \bigl(\|f\|_{H^1} + \epsilon_n\|(x + i)^{-2}\|_1\bigr) =
\|f\|_{H^1}$.


19. Every $f$ in $C_{\mathrm{com}}(X)$ has $\bigl|\int_X f(x)\,d\nu(x)\bigr| = \lim_n \bigl|\int_X f(x) g_n(x)\,d\mu(x)\bigr| \le
\limsup_n \int_X |f(x)|\,|g_n(x)|\,d\mu(x) \le \int_X |f(x)|\,d\mu(x)$. If $K$ is compact in $X$, we can
find a sequence $\{f_k\}$ of functions $\ge 0$ in $C_{\mathrm{com}}(X)$ decreasing pointwise to the indicator
function of $K$, and dominated convergence implies that $\bigl|\int_K d\nu(x)\bigr| \le \int_K d\mu(x)$. In
other words, $|\nu(K)| \le \mu(K)$. Separating the real and imaginary parts of $\nu$ and then
working with subsets of a maximal positive set for $\nu$ and a maximal negative set for
$\nu$, we reduce to the case that $\nu \ge 0$. Since $\nu$ is automatically regular, we obtain
$\nu(E) \le \mu(E)$ for all Borel sets $E$, and the absolute continuity follows.


20. Since f is in H!, it is in H! and hence is the Poisson integral of a finite 
complex Borel measure v, and the complex measures f (x +i/n) dx converge weak- 
star against Ceom(R) to v. Meanwhile, we have | f(x + i/n)| < C*g* (x)? for 
all n. In Problem 19 take du(x) = C79" (x)? dx. Then the complex measures 
fat i/n)[C¢* (x)?]7! d(x) converge weak-star to v. Problem 19 shows that v is 
absolutely continuous with respect to C*g*(x)* dx. Hence v is absolutely continuous 
with respect to Lebesgue measure. 


21. For (a), $\mathcal{F}(T\varphi)$ is the product of an $L^\infty$ function and a Schwartz function. The
rapid decrease of the Fourier transform translates into the existence of derivatives of
all orders for the function itself. Hence $\Phi$ is locally bounded.

For (b), any x with |x| > 1 has 


D(x) =limyyo fiyyne (ESR — Gr) 90) dy. 


Hence |®(x)| is 


< limsupy jo fy >, PO)IK ior ar | dy + fow y(y) Kan Us KOl dy, 


If |x| = 2|y| for all y in the support of g, two estimates in the text are applicable; 
these appear in the proof that the hypotheses of Lemma 3.29 are satisfied: 


ae a and = |K(x—y) — K(x) < (24). 


[x|N¥I |x| 


l= ae _ 


The smoothness of $K$ makes $\psi(t) \le Ct$ for small positive $t$. Since the $y$'s in question
are all in the compact support of $\varphi$, both terms are bounded by multiples of $|x|^{-(N+1)}$.
Conclusion (c) is immediate from (a) and (b).

22. Part (a) is just a matter of tracking down the effects of dilations. Part (c)
follows by dilating $\Phi = T\varphi - k$ to obtain $\Phi_\varepsilon = (T\varphi)_\varepsilon - k_\varepsilon$, by applying (a) to write
$\Phi_\varepsilon = T\varphi_\varepsilon - k_\varepsilon$, by convolving with $f$, and by applying (b). Thus we have to prove
(b).

For (b), we have $\varphi_\varepsilon * Tf = \varphi_\varepsilon * (\lim_\delta T_\delta f)$. The limit is in $L^p$, and convolution
by the $L^{p'}$ function $\varphi_\varepsilon$ is bounded from $L^p$ to $L^\infty$. Therefore $\varphi_\varepsilon * (\lim_\delta T_\delta f)$ equals
$\lim_\delta(\varphi_\varepsilon * (T_\delta f)) = \lim_\delta(\varphi_\varepsilon * (k_\delta * f))$. This is equal to $\lim_\delta((\varphi_\varepsilon * k_\delta) * f) =
\lim_\delta((T_\delta\varphi_\varepsilon) * f)$ since $\varphi_\varepsilon$ is in $L^1$. Finally we can move the limit inside since
$\lim_\delta T_\delta\varphi_\varepsilon$ can be considered as an $L^{p'}$ limit and $f$ is in $L^p$.

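23. From (c), we have $\sup_{\varepsilon>0} |T_\varepsilon f(x)| = \sup_{\varepsilon>0} |k_\varepsilon * f(x)| \le \sup_{\varepsilon>0} |\Phi_\varepsilon * f(x)|
+ \sup_{\varepsilon>0} |\varphi_\varepsilon * (Tf)(x)| \le C_\Phi f^*(x) + C_\varphi (Tf)^*(x)$, where $C_\Phi$ and $C_\varphi$ are as in the
given facts at the beginning of this group of problems.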

24. Taking $L^p$ norms in the previous problem and using Theorem 3.26 and the
known behavior of Hardy-Littlewood maximal functions, we obtain

\[
\begin{aligned}
\big\| \sup_{\varepsilon>0} |T_\varepsilon f(x)| \big\|_p
&\le C_\Phi\|f^*\|_p + C_\varphi\|(Tf)^*\|_p \le C_\Phi A_p\|f\|_p + C_\varphi A_p\|Tf\|_p \\
&\le C_\Phi A_p\|f\|_p + C_\varphi A_pC_p\|f\|_p = C\|f\|_p,
\end{aligned}
\]

where $A_p$ and $C_p$ are constants such that $\|f^*\|_p \le A_p\|f\|_p$ and $\|Tf\|_p \le C_p\|f\|_p$.
We know that $\lim_{\varepsilon\downarrow 0} T_\varepsilon f(x)$ exists pointwise for $f$ in the dense set $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$, and a
familiar argument uses the above information to give the existence of the pointwise
limit almost everywhere for all $f$ in $L^p$.

25. This follows from the same argument as for Proposition 3.7. 

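26. Fix $\psi \ge 0$ in $C^\infty_{\mathrm{com}}(\mathbb{R}^N)$ with integral 1, and define $\psi_\varepsilon(x) = \varepsilon^{-N}\psi(\varepsilon^{-1}x)$. If
$f$ is in $L^2_k(T^N)$, then $\psi_\varepsilon * f$ is smooth and periodic, hence is in $C^\infty(T^N)$. Suppose
it is proved that

\[ D^\alpha(\psi_\varepsilon * f) = \psi_\varepsilon * D^\alpha f \quad\text{for } |\alpha| \le k. \tag{$*$} \]

If we let $I$ be the indicator function of $[-2\pi, 2\pi]^N$, then Proposition 3.5a shows
that $\lim_{\varepsilon\downarrow 0} \|I(\psi_\varepsilon * D^\alpha f - D^\alpha f)\|_2 = 0$ for $|\alpha| \le k$, and then $(*)$ shows that
$\lim_{\varepsilon\downarrow 0} \|I(D^\alpha(\psi_\varepsilon * f) - D^\alpha f)\|_2 = 0$. Hence $\lim_{\varepsilon\downarrow 0} \|\psi_\varepsilon * f - f\|_{L^2_k(T^N)} = 0$.

For $(*)$, the critical fact is that the smooth function $\psi_\varepsilon * f$ is periodic. If $\varphi$ is
periodic and $\psi_\varepsilon$ is supported inside $[-\pi, \pi]^N$, then

\[
\begin{aligned}
\int_{[-\pi,\pi]^N} (\psi_\varepsilon * D^\alpha f)(x)\,\varphi(x)\,dx
&= \int_{[-\pi,\pi]^N}\int_{[-\pi,\pi]^N} \psi_\varepsilon(y)\,D^\alpha f(x-y)\,\varphi(x)\,dy\,dx \\
&= \int_{[-\pi,\pi]^N}\int_{[-\pi,\pi]^N} \psi_\varepsilon(y)\,D^\alpha f(x-y)\,\varphi(x)\,dx\,dy \\
&= (-1)^{|\alpha|}\int_{[-\pi,\pi]^N}\int_{[-\pi,\pi]^N} \psi_\varepsilon(y)\,f(x-y)\,D^\alpha\varphi(x)\,dx\,dy \\
&= (-1)^{|\alpha|}\int_{[-\pi,\pi]^N} (\psi_\varepsilon * f)(x)\,D^\alpha\varphi(x)\,dx \\
&= \int_{[-\pi,\pi]^N} D^\alpha(\psi_\varepsilon * f)(x)\,\varphi(x)\,dx,
\end{aligned}
\]

and $(*)$ follows.

27. We have

\[
\begin{aligned}
\|D^\alpha f\|^2_{L^2_k(T^N)}
&= \sum_{|\beta|\le k} (2\pi)^{-N}\int_{[-\pi,\pi]^N} |D^\beta D^\alpha f|^2\,dx
= \sum_{|\beta|\le k} (2\pi)^{-N}\int_{[-\pi,\pi]^N} |D^{\alpha+\beta} f|^2\,dx \\
&\le \sum_{|\gamma|\le k+|\alpha|} (2\pi)^{-N}\int_{[-\pi,\pi]^N} |D^\gamma f|^2\,dx
= \|f\|^2_{L^2_{k+|\alpha|}(T^N)}.
\end{aligned}
\]

Thus we can take $C_{\alpha,k} = 1$.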


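28. For each $\alpha$, we have $(2\pi)^{-N}\int_{[-\pi,\pi]^N} |D^\alpha f|^2\,dx \le \big(\sup_{x\in[-\pi,\pi]^N} |D^\alpha f(x)|\big)^2$.
Summing for $|\alpha| \le k$ gives

\[ \|f\|^2_{L^2_k(T^N)} \le \sum_{|\alpha|\le k} \big(\sup_{x\in[-\pi,\pi]^N} |D^\alpha f(x)|\big)^2, \]

and the right side is $\le \big(\sum_{|\alpha|\le k} \sup_{x\in[-\pi,\pi]^N} |D^\alpha f(x)|\big)^2$. Thus we can take $A_k = 1$.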

29. Since $l_j^2 \le |l|^2$, we have $l^{2\alpha} \le (|l|^2)^{|\alpha|} \le (1 + |l|^2)^k$, and the left inequality
of the problem follows with $B_k$ equal to the reciprocal of the number of $\alpha$'s with
$|\alpha| \le k$. For the right inequality, we have $1 + |l|^2 = \sum_{|\alpha|\le 1} l^{2\alpha}$. Raising both sides
to the $k$-th power gives the desired result once the right side is expanded out since
$l^{2\alpha}l^{2\beta} = l^{2(\alpha+\beta)}$.
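
As an illustration only, the two inequalities can be checked by hand in the smallest nontrivial case. The sketch below assumes that the statement of Problem 29 has the form $B_k\sum_{|\alpha|\le k} l^{2\alpha} \le (1+|l|^2)^k \le C_k\sum_{|\alpha|\le k} l^{2\alpha}$, which is inferred from the hint rather than quoted; the choice $N = 1$, $k = 2$ is ours.

```latex
% Worked instance with N = 1, k = 2 (illustrative choice, not from the text).
% The multi-indices are alpha = 0, 1, 2, so the sum is 1 + l^2 + l^4.
\[
  \tfrac{1}{3}\,\bigl(1 + l^2 + l^4\bigr)
  \;\le\; (1 + l^2)^2 \;=\; 1 + 2l^2 + l^4
  \;\le\; 2\,\bigl(1 + l^2 + l^4\bigr).
\]
% Left inequality: each of 1, l^2, l^4 is at most (1 + l^2)^2, and there are
% three of them, so B_2 = 1/3 (the reciprocal of the number of alpha's).
% Right inequality: every coefficient in the expansion is at most 2.
```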

30-31. For $f$ in $C^\infty(T^N)$, let $f$ have Fourier coefficients $c_l$. The $l$-th Fourier
coefficient of $D^\alpha f$ is then $i^{|\alpha|}l^\alpha c_l$, and hence $\|D^\alpha f\|_2^2 = \sum_l |c_l|^2 l^{2\alpha}$. Consequently
$\|f\|^2_{L^2_k(T^N)} = \sum_l |c_l|^2\big(\sum_{|\alpha|\le k} l^{2\alpha}\big)$. Then the estimate required for Problem 31 in the
case of functions in $C^\infty(T^N)$ is immediate from the inequalities of Problem 29.
Problem 26 shows that $C^\infty(T^N)$ is dense in $L^2_k(T^N)$. Let $f$ be given in $L^2_k(T^N)$, and
choose $f^{(n)}$ in $C^\infty(T^N)$ convergent to $f$ in $L^2_k(T^N)$. Since $f^{(n)}$ tends to $f$ in $L^2$, the
Fourier coefficients $c_l^{(n)}$ of $f^{(n)}$ tend to those $c_l$ of $f$ for each $l$. Applying Problem 29
to each $f^{(n)}$ and using Fatou's Lemma, we obtain $\sum_l |c_l|^2(1 + |l|^2)^k \le C_k\|f\|^2_{L^2_k(T^N)}$.
On the other hand, if $f$ is given in $L^2_k(T^N)$ with Fourier coefficients $c_l$, then we
can put $f^{(n)}(x) = \sum_{|l|\le n} c_l e^{il\cdot x}$. Since $f^{(n)}$ is given by a finite sum and since
$D^\alpha f(x) = \sum_l i^{|\alpha|}l^\alpha c_l e^{il\cdot x}$ in the $L^2$ sense for $|\alpha| \le k$, we see that $f^{(n)}$ converges to
$f$ in $L^2_k(T^N)$. The left inequality of Problem 31 holds for each $f^{(n)}$ since $f^{(n)}$ is
in $C^\infty(T^N)$, and the expression in the middle of that inequality for $f^{(n)}$ is $\le$ the
corresponding expression for $f$. Passing to the limit, we obtain the left inequality of
Problem 31 for $f$.

This settles Problem 31. It shows also that if $f$ is in $L^2_k(T^N)$, then we have
$\sum_l |c_l|^2(1 + |l|^2)^k < \infty$. On the other hand, if this sum is finite, then we define $f^{(n)}$
to be $\sum_{|l|\le n} c_l e^{il\cdot x}$. Problem 31 gives us $B_k\|f^{(n)}\|^2_{L^2_k(T^N)} \le \sum_l |c_l|^2(1 + |l|^2)^k$ for each
$n$. Each $D^\alpha f^{(n)}$ for $|\alpha| \le k$ is convergent to something in $L^2$, and the completeness
of $L^2_k(T^N)$ proved in Problem 25 shows that $f^{(n)}$ converges to something in $L^2_k(T^N)$.
Consideration of Fourier coefficients shows that the limit function must be $f$. Hence
$f$ is in $L^2_k(T^N)$.

32. Put $c = K/N > 1/2$. Term by term we have $\sum_{l\in\mathbb{Z}^N}(1 + |l|^2)^{-K} \le
\sum_{l\in\mathbb{Z}^N}\prod_{j=1}^N (1 + l_j^2)^{-c} = \prod_{j=1}^N \big(\sum_{m\in\mathbb{Z}} (1 + m^2)^{-c}\big)$, and the right
side is finite since $c > 1/2$. This proves convergence of the sum.

Now suppose that $f$ is in $L^2_K(T^N)$, and suppose that $f$ has Fourier coefficients $c_l$.
Problem 31 shows that $\sum_l |c_l|^2(1 + |l|^2)^K < \infty$. The Schwarz inequality gives

\[
\begin{aligned}
\sum_l |c_l| &= \sum_l |c_l|(1 + |l|^2)^{K/2}(1 + |l|^2)^{-K/2} \\
&\le \Big(\sum_l |c_l|^2(1 + |l|^2)^K\Big)^{1/2}\Big(\sum_l (1 + |l|^2)^{-K}\Big)^{1/2},
\end{aligned}
\]

and we conclude that $\sum_l |c_l| < \infty$. Therefore the partial sums of the Fourier series
of $f$ converge to a continuous function. This continuous function has to match the
$L^2$ limit almost everywhere, and the latter is $f$.

33. Let $c_l$ be the Fourier coefficients of $f$. If $f$ is in $L^2_K(T^N)$ with $K > N/2$,
then Problem 32 shows that $f$ is continuous and is given pointwise by the sum
of its Fourier series. The inequalities in the solution for that problem show that

\[ |f(x)| \le \sum_l |c_l| \le A_K\Big(\sum_l |c_l|^2(1 + |l|^2)^K\Big)^{1/2} \]

with $A_K = \big(\sum_l (1 + |l|^2)^{-K}\big)^{1/2}$. In turn, Problem 31 shows that
the right side is $\le A_KC_K^{1/2}\|f\|_{L^2_K(T^N)}$. This gives the desired estimate for $\alpha = 0$ with
$m(0) = K$ for any integer $K$ greater than $N/2$. Combining this estimate with the
result of Problem 27, we obtain an inequality for all $\alpha$, with $m(\alpha) = K + |\alpha|$ and
$C_\alpha = A_KC_K^{1/2}$.

34. The comparisons of size are given in Problems 28 and 33. These comparisons
establish the uniform continuity of the identity map in both directions, by the proof
of Proposition 3.2. (The statement of the proposition asserts only continuity.)


Chapter IV 


1. With the explicit definition of the norm topology on $X/Y$, we have $\|x + Y\| \le
\|x\|$, and consequently the quotient mapping $q : X \to X/Y$ is continuous onto the
normed $X/Y$. Because of completeness the Interior Mapping Theorem applies and
shows that the quotient mapping carries open sets to open sets. Consequently a subset
$E$ of $X/Y$ in the norm topology is open if and only if $q^{-1}(E)$ is open. This is the same
as the defining condition for a subset of $X/Y$ to be open in the quotient topology, and
hence the topologies match.

2. Let $K = \ker(T)$, and let $q : X \to X/K$ be the quotient map. By linear
algebra the map $T : X \to Y$ induces a one-one linear map $T' : X/K \to Y$, and
then $T = T' \circ q$. Since $K$ is closed in $X$, Proposition 4.4 shows that $X/K$ is a
topological vector space. Since $T(X)$ is finite dimensional and $T'$ is one-one, $X/K$
is finite dimensional. Proposition 4.5 implies that $T'$ is continuous. Since $T$ is the
composition of continuous maps, it is continuous.


3. Let $T : X \to Y$ be a continuous linear map from one Banach space onto another,
and let $K = \ker T$. As in Problem 2, write $T = T' \circ q$, where $q : X \to X/K$ is
the quotient mapping. Here $T'$ is one-one. Since a subset $E$ of $X/K$ is open if and
only if $q^{-1}(E)$ is open, $T'$ is continuous. Problem 1 shows that the topology on $X/K$
comes from a Banach space structure. By the assumed special case of the Interior
Mapping Theorem, $T'$ carries open sets to open sets. Therefore the composition $T$
carries open sets to open sets.


4. This follows from Proposition 4.5. 

5. Take $x_n$ to be the $n$-th member of an orthonormal basis. Then $\|x_n\| = 1$ for all
$n$. Any $u$ in $H$ has an expansion $u = \sum_{n=1}^\infty c_nx_n$, convergent in $H$, with $c_n = (u, x_n)$
and $\sum |c_n|^2 < \infty$. Then $\{(u, x_n)\}$ tends to 0 for each $u$, and $\{x_n\}$ therefore tends to 0
weakly.
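
The step from square-summability to weak convergence can be spelled out as follows. This is a sketch of the standard argument rather than a quotation from the text; it uses only that the coefficients of $u$ are square-summable.

```latex
% Square-summable coefficients force the individual terms to 0.
\[
  \sum_{n=1}^{\infty} |(u, x_n)|^2 \;\le\; \|u\|^2 \;<\; \infty
  \quad\Longrightarrow\quad
  \lim_{n\to\infty} (x_n, u) \;=\; \lim_{n\to\infty} \overline{(u, x_n)} \;=\; 0
  \quad\text{for every } u \in H.
\]
% Thus x_n -> 0 weakly, while ||x_n|| = 1 shows that x_n does not tend to 0 in norm.
```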

6. The weak convergence implies that $\lim_n (f_n, f) = (f, f) = \|f\|^2$. Therefore
$\|f_n - f\|^2 = \|f_n\|^2 - 2\operatorname{Re}(f_n, f) + \|f\|^2$ tends to $\|f\|^2 - 2\|f\|^2 + \|f\|^2 = 0$.

7. Let the dense subset of $X^*$ be $D$. For $x^*$ in $X^*$ and $y^*$ in $D$, we have

\[
\begin{aligned}
|x^*(x_n) - x^*(x_0)| &\le |(x^* - y^*)(x_n)| + |y^*(x_n) - y^*(x_0)| + |(y^* - x^*)(x_0)| \\
&\le \|x^* - y^*\|\|x_n\| + |y^*(x_n) - y^*(x_0)| + \|x^* - y^*\|\|x_0\| \\
&\le (C + \|x_0\|)\|x^* - y^*\| + |y^*(x_n) - y^*(x_0)|,
\end{aligned}
\]

where $C = \sup_n \|x_n\|$. Given $x^* \in X^*$ and $\epsilon > 0$, choose $y^*$ in $D$ to make the first term
on the right be $< \epsilon$, and then choose $n$ large enough to make the second term $< \epsilon$.


8. For (a), let $D(f) = 1$. Then $t \mapsto \int_{[0,t]} |f|^p\,dx$ is a continuous nondecreasing
function on $[0, 1]$ that is 0 at $t = 0$ and is 1 at $t = 1$. Therefore there exists a
partition $0 = a_0 < a_1 < \cdots < a_n = 1$ of $[0, 1]$ such that $\int_{[0,a_j]} |f|^p\,dx = j/n$ for
$0 \le j \le n$. If $f_j$ for $j \ge 1$ is the product of $nf$ and the indicator function of $[a_{j-1}, a_j]$,
then $D(f_j) = \frac{1}{n}n^p = n^{-(1-p)}$ and $f = \frac{1}{n}(f_1 + \cdots + f_n)$.
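
For a concrete picture of (a), take the constant function on $[0,1]$. The sketch below is illustrative only: the choices $f = 1$, $p = 1/2$, and $n = 4$ are ours, and $D(f) = \int_{[0,1]} |f|^p\,dx$ with $0 < p < 1$ is assumed to be the definition from the problem statement.

```latex
% Illustrative instance of 8(a): f = 1 on [0,1], p = 1/2, n = 4.
% The partition points are a_j = j/4, since the integral of |f|^p over [0, t] is t.
\[
  f_j = 4\cdot \mathbf{1}_{[(j-1)/4,\; j/4]}, \qquad
  D(f_j) = \int_{(j-1)/4}^{j/4} 4^{1/2}\,dx = \tfrac{1}{4}\cdot 2 = \tfrac{1}{2}
         = 4^{-(1-1/2)}, \qquad
  f = \tfrac{1}{4}\,(f_1 + f_2 + f_3 + f_4).
\]
% So f, which has D(f) = 1, is an average of functions with D(f_j) = 1/2 < 1.
% For general n the value is D(f_j) = n^{-(1-p)}, which tends to 0 as n grows.
```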

For (b), let $g_j = cf_j$ in (a), so that $D(g_j) = |c|^pD(f_j) = |c|^pn^{-(1-p)}$. If we put $c =
n^{(1-p)/p}$, then $D(g_j) = 1$. Thus we obtain the expansion $n^{(1-p)/p}f = \frac{1}{n}(g_1 + \cdots + g_n)$
with $D(g_j) = 1$ for each $j$. Since $D(n^{(1-p)/p}f) = n^{1-p}D(f) = n^{1-p}$, the multiple
$n^{(1-p)/p}f$ of $f$ is a convex combination of functions $h$ with $D(h) \le 1$. Taking a
convex combination of 0 and this multiple of $f$ shows that $rf$ is a convex combination
of functions $h$ with $D(h) \le 1$ if $0 \le r \le n^{(1-p)/p}$. Since $\sup_n n^{(1-p)/p} = +\infty$, every
nonnegative multiple of $f$ is a convex combination of functions $h$ with $D(h) \le 1$.

For (c), we scale the result of (b). The smallest convex set containing all functions
$\varepsilon^{1/p}h$ with $D(h) \le 1$ contains all nonnegative multiples of $f$. Since $D(\varepsilon^{1/p}h) =
\varepsilon D(h)$, the smallest convex set containing all functions $k$ with $D(k) \le \varepsilon$ contains all
nonnegative multiples of $f$. Since $f$ is arbitrary, this convex set is all of $L^p([0, 1])$.

For (d), the sets where $D(f) \le \varepsilon$ form a local neighborhood base at 0. Thus if
$L^p([0, 1])$ were locally convex, then any convex open set containing 0 would have
to contain, for some $\varepsilon > 0$, the set of all $f$ with $D(f) \le \varepsilon$. But the only convex set
containing all $f$ with $D(f) \le \varepsilon$ is all of $L^p([0, 1])$ by (c). Hence $L^p([0, 1])$ is not
locally convex.

For (e), suppose that $\ell$ is a continuous linear functional on $L^p([0, 1])$. Then we
can find some $\varepsilon > 0$ such that $D(f) \le \varepsilon$ implies $\operatorname{Re}\ell(f) < 1$. The set of all $f$ where
$\operatorname{Re}\ell(f) < 1$ is a convex set, and it contains the set of all $f$ with $D(f) \le \varepsilon$. But we
saw in (c) that the only such convex set is $L^p([0, 1])$ itself. Therefore $\operatorname{Re}\ell(f) < 1$
for all $f$ in $L^p([0, 1])$. Using scalar multiples, we see that $\operatorname{Re}\ell(f) = 0$ for all $f$.
Therefore $\ell(f) = 0$, and the only continuous linear functional $\ell$ on $L^p([0, 1])$ is
$\ell = 0$.


9. In (a), if $\varphi$ is compactly supported in $K_{p_0}$, then $\varepsilon_p^{-1}\sup_{x\notin K_p}\sup_{|\alpha|\le m_p} |D^\alpha\varphi(x)|$
is 0 for $p \ge p_0$. Thus $\|\varphi\|_{m,\varepsilon}$ is a supremum for $p < p_0$ of finitely many expressions
that are each finite for any smooth function on $U$. Hence $\|\varphi\|_{m,\varepsilon}$ is finite. Conversely
if g is not compactly supported, then the expressions s, = supy¢ K, |p(x)| have 
0 < sy < o forall p. If we define the sequence ¢ by e, = min(p~!, Sp), then €, 
decreases to 0 and every sequence m has ||9|_.- = a SUP gx, |g(x)| = p for all p. 
Since p is arbitrary, ||¢]|,,. = ©. 

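For (b), we have only to show that the inclusion of $C^\infty_{K_p}$ into $(C^\infty_{\mathrm{com}}(U), \mathcal{T}')$ is
continuous for every $p$. If $(m, \varepsilon)$ is given, we are to find an open neighborhood of 0 in
$C^\infty_{K_p}$ such that $\|\varphi\|_{m,\varepsilon} < 1$ for all $\varphi$ in this neighborhood. Put $M = \max(m_1, \ldots, m_p)$
and $\delta = \min(\varepsilon_1, \ldots, \varepsilon_p)$. If $\varphi$ is supported in $K_p$ and $\sup_{x\in K_p}\sup_{|\alpha|\le M} |D^\alpha\varphi(x)| <
\delta$, then $\varepsilon_r^{-1}\sup_{x\notin K_r}\sup_{|\alpha|\le m_r} |D^\alpha\varphi(x)|$ is 0 for $r \ge p$ and is $< 1$ for $r < p$. Therefore
its supremum on $r$, which is $\|\varphi\|_{m,\varepsilon}$, is $< 1$.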

For (c), define m, = max{p,nj,...,np} for each p, and then {m,} is monotone 
increasing and tends to infinity. Next choose C, for each p by the compactness of the 
support of y, and the use of the Leibniz rule on y,7 so that whenever |D%n(x)| < ¢ 
forsomen € C°(U),allx ¢ Kp,and all a with |a| < my, then grat |D* (Wpn)(x)| < 
C_c for that 7, all x € U, and all a with |w| < m,. Choose ¢, to be < 6,/C, and 
to be such that {¢,} is monotone decreasing and has limit 0. If ||gIl,,,.. < 1, then 
SUP gx, SUPja|<m, |D“G()| < €p for all p. Taking 1 = ¢ in the definition of Cp, we 
see that sup, <y SUP ja|<m, 2°41) D* (Wy) (x)| < Cp€p < dp. Since pg is in Ce, 
and m, > np, we see that ptt Wye meets the condition for being in NM CRs 


For (d), we see from (c) that $2^{p+1}\psi_p\varphi$ is in $N$ for all $p \ge 0$. The expansion
$\varphi = \sum_{p\ge 0} 2^{-(p+1)}(2^{p+1}\psi_p\varphi)$ is a finite sum since $\varphi$ has compact support, and it
therefore exhibits $\varphi$ as a convex combination of the 0 function and finitely many
functions $2^{p+1}\psi_p\varphi$, each of which is in $N$. Since $N$ is convex, $\varphi$ is in $N$. This proves
the asserted continuity.

For (e), each vector subspace $C^\infty_{K_p}$ is closed nowhere dense, and the union of these
subspaces is all of $C^\infty_{\mathrm{com}}(U)$.


10. Disproof: The answer is certainly independent of $H$, and we can therefore
specialize to $H = L^2([0, 1])$. The multiplication algebra by $L^\infty([0, 1])$ is isometric
to a subalgebra of $B(H, H)$ and is not separable. Therefore $B(H, H)$ is not separable.

11. Certainly $A' \supseteq M(L^\infty(S, \mu))$. Let $T$ be in $A'$, and put $g = T(1)$. For
$f$ continuous, $Tf = T(f1) = TM_f1 = M_fT1 = M_fg = fg = gf$. If we
can prove that $g$ is in $L^\infty(S, \mu)$, then $T$ and $M_g$ will be bounded operators equal
on the dense subset $C(S)$ of $L^2(S, \mu)$ and therefore equal everywhere. Let $E_N =
\{x \mid N \le |g(x)| \le N + 1\}$, and suppose that $\mu(E_N) > 0$. We shall derive an
upper bound for $N$. Choose a compact set $K_N \subseteq E_N$ with $\mu(K_N) > 0$. Then
choose $f$ in $C(S)$ with values in $[0,1]$ such that $f \ge 1$ on $K_N$ and $\int_S f\,d\mu \le
2\mu(K_N)$. Then $\int_S |gf|^2\,d\mu \ge \int_{K_N} |gf|^2\,d\mu = \int_{K_N} |g|^2\,d\mu \ge N^2\mu(K_N)$. Also,
$\int_S |f|^2\,d\mu \le \int_S f\,d\mu \le 2\mu(K_N)$ since $0 \le f \le 1$. Therefore $N\mu(K_N)^{1/2} \le
\|gf\|_2 \le \|T\|\|f\|_2 \le \sqrt{2}\,\|T\|\mu(K_N)^{1/2}$, and we obtain $N \le \sqrt{2}\,\|T\|$. This gives
an upper bound for $N$ and shows that $g$ is in $L^\infty(S, \mu)$.


12. The Spectral Theorem shows that we may assume that $A$ is of the form $M_g$ and
acts on $H = L^2(S, \mu)$, with $g$ in $L^\infty(S, \mu)$. Certainly we have $\sup_{\|f\|_2\le 1} |(M_gf, f)|
\le \|g\|_\infty$. Let us prove the reverse inequality. Lemma 4.55 and Proposition 4.43 show
that $\|g\|_\infty$ is the supremum of the numbers $|\lambda_0|$ such that $\lambda_0$ is in the essential image
of $g$. For $\lambda_0$ in the essential image, fix $\epsilon > 0$ and let $f_1$ be the indicator function of
$g^{-1}(\{|\lambda - \lambda_0| < \epsilon\})$. Then

\[
\int_S g|f_1|^2\,d\mu = \int_{|g(x)-\lambda_0|<\epsilon} g\,d\mu
= \lambda_0\,\mu(|g(x) - \lambda_0| < \epsilon) + \int_{|g(x)-\lambda_0|<\epsilon} (g - \lambda_0)\,d\mu.
\]

The last term on the right is $\le \epsilon\,\mu(|g(x) - \lambda_0| < \epsilon)$ in absolute value. Hence
$\int_S g|f_1|^2\,d\mu = (\lambda_0 + \zeta)\mu(|g(x) - \lambda_0| < \epsilon)$ with $|\zeta| \le \epsilon$. Dividing by $\|f_1\|_2^2 =
\mu(|g(x) - \lambda_0| < \epsilon)$ and setting $f = f_1/\|f_1\|_2$, we obtain $\big|\int_S g|f|^2\,d\mu - \lambda_0\big| \le \epsilon$.
Since $\epsilon$ is arbitrary, $\lambda_0$ is in the closure of $\{(M_gf, f) \mid \|f\|_2 = 1\}$. Taking the
supremum over $\lambda_0$ in the essential image, we obtain $\sup_{\|f\|_2\le 1} |(M_gf, f)| \ge \|g\|_\infty$.


13. This is what the proof of Theorem 4.53 gives when the assumption that A is 
maximal is dropped and the cyclic vector is produced by a hypothesis rather than by 
Proposition 4.52. 


14. Apply the previous problem. Proposition 4.63 shows that $\mathcal{A}^*_m$ is canonically
homeomorphic to $\sigma(A)$. Under this identification we want to see that $UAU^{-1}$ is
multiplication by $z$. Thus let $\psi : \sigma(A) \to \mathcal{A}^*_m$ be the homeomorphism obtained from
the proposition. The solution of the previous problem and the proof of Theorem 4.53
show that $UAU^{-1}$ is multiplication by $\widehat{A}$ when we work with $\mathcal{A}^*_m$, and it is therefore
$\widehat{A} \circ \psi$ when we work with $\sigma(A)$. The defining property of $\psi$ is that $f(z) = \widehat{f(A)}(\psi(z))$
for $f \in C(\sigma(A))$ and $z \in \sigma(A)$. This equation for the function $f(z) = z$ says that
$\widehat{A}(\psi(z)) = z$, and hence $UAU^{-1}$ is multiplication by $z$ on $\sigma(A)$.

15. For (a), $A$ immediately contains all $M_P$ for arbitrary polynomials $P$ with
complex coefficients on $[0, 1]$. By the Stone-Weierstrass Theorem, $A$ contains all
operators $M_f$ with $f$ continuous on $[0, 1]$. This collection of operators is an algebra
closed under adjoints and operator limits (which are the same as essentially uniform
limits of the functions), and hence it exhausts $A$. If we then form $A1$, we obtain all
continuous functions in $L^2([0, 1])$, and these are dense. Hence 1 is cyclic.

For (b), Proposition 4.63 says that the spectrum may be identified with $\sigma(M_x)$,
and Lemma 4.55 shows that this is $[0, 1]$.

In (c), the system of operators $M_\varphi$ satisfies conditions (a) through (d) for the
system $\varphi(M_x)$ of Theorem 4.57. By uniqueness, $\varphi(M_x) = M_\varphi$ for every bounded
Borel function $\varphi$ on $[0, 1]$.


17. If $0 < \mu(S) < 1$, then $\mu$ is a nontrivial convex combination of 0 and a
measure with total mass 1 and is therefore not extreme. Since 0 is evidently extreme,
the problem is to identify the extreme measures among those with total mass 1. If
$\mu$ is given with $\mu(S) = 1$ and if some Borel set $E$ has $0 < \mu(E) < 1$, define
$\mu_1(A) = \mu(E)^{-1}\mu(E \cap A)$ and $\mu_2(A) = \mu(E^c)^{-1}\mu(E^c \cap A)$. Then $\mu_1$ and $\mu_2$ have
total mass 1, and the equality $\mu = \mu(E)\mu_1 + \mu(E^c)\mu_2$ shows that $\mu$ is not extreme.

Thus we may assume that $\mu$ takes on only the values 0 and 1. In this case the
regularity of $\mu$ implies that $\mu$ is a point mass, as is shown in Problem 6 of Chapter XI
of Basic.


18. For (a), we have $f = (1 - t)\|f_1\|_1^{-1}f_1 + t\|f_2\|_1^{-1}f_2$ with $t = \|f_2\|_1$. For
(b), we observe for any $f$ in $L^1([0, 1])$ with $\|f\|_1 = 1$ that $t \mapsto \int_{[0,t]} |f|\,dx$ is
continuous on $[0, 1]$, is 0 at $t = 0$, and is 1 at $t = 1$. Therefore there exists some $t_0$
with $\int_{[0,t_0]} |f|\,dx = \frac{1}{2}$. The set $E = [0, t_0]$ is then a set to which we can apply (a) to
see that $f$ is not an extreme point of the closed unit ball.


19. For the compactness of $K$ in (a), we are to show that the set of invariant
measures is closed. Such measures $\mu$ have $\int_S f\,d\mu = \int_S (f \circ F)\,d\mu$ for all $f \in C(S)$.
If we have a net $\{\mu_\alpha\}$ of such measures convergent weak-star to $\mu$, then we can pass
to the limit in the equality for each $\mu_\alpha$ and obtain $\int_S f\,d\mu = \int_S (f \circ F)\,d\mu$ for the
limit $\mu$ since $f$ and $f \circ F$ are both continuous. If we define $\nu(E) = \mu(F^{-1}(E))$, this
equality says that $\int_S f\,d\mu = \int_S f\,d\nu$ for every $f \in C(S)$. By the uniqueness in the
Riesz Representation Theorem, $\mu = \nu$. Therefore the limit $\mu$ is invariant under $F$.

In (b), if $\mu$ could be extreme but not ergodic, we could find a Borel set $E$ with
$0 < \mu(E) < 1$ such that $F(E) = E$. Put $\mu_1(A) = \mu(E)^{-1}\mu(A \cap E)$ and $\mu_2(A) =
\mu(E^c)^{-1}\mu(A \cap E^c)$. The invariance of the set $E$ implies that $\mu_1$ and $\mu_2$ are invariant.
Since $\mu = \mu(E)\mu_1 + \mu(E^c)\mu_2$, $\mu$ is exhibited as a nontrivial convex combination
of invariant measures and cannot be extreme.

For (c), the answer is "no." Take $S$ to be a two-point set with the discrete topology,
and let $F$ interchange the two points. Then every measure $\mu$ on $S$ with $\mu(S) = 1$ is
ergodic, but only the two point masses are extreme points.


20. For (a) the assumed condition on $f$ for the function $c(n)$ that is nonzero at
$n = 0$ and is 0 elsewhere shows that $f(0) \ge 0$. The condition on $f$ for the function
$c(n)$ that is nonzero at 0 and $k$ and is 0 elsewhere is that the matrix
$\begin{pmatrix} f(0) & f(k) \\ f(-k) & f(0) \end{pmatrix}$ is
Hermitian and positive semidefinite. The Hermitian condition forces $f(-k) = \overline{f(k)}$,
and the condition determinant $\ge 0$ then says that $|f(k)|^2 \le f(0)^2$.
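
The determinant step can be written out explicitly. This is a short sketch of the computation described in (a), using only the Hermitian relation $f(-k) = \overline{f(k)}$ and $f(0) \ge 0$.

```latex
% Determinant of the 2-by-2 matrix in 20(a), using f(-k) = conj f(k).
\[
  \det\begin{pmatrix} f(0) & f(k) \\ f(-k) & f(0) \end{pmatrix}
  = f(0)^2 - f(k)\,\overline{f(k)}
  = f(0)^2 - |f(k)|^2 \;\ge\; 0 ,
\]
% and since f(0) >= 0, taking square roots gives |f(k)| <= f(0) for every k.
```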

For (b), Example 2 of weak-star convergence in Section 3 says that a necessary
and sufficient condition for a sequence $\{f_n\}$ in $L^\infty$ to converge to $f$ weak-star is that
$\{\|f_n\|_\infty\}$ be bounded, which we are assuming, and that $\int_E f_n\,d\mu \to \int_E f\,d\mu$ for
every $E$ of finite measure. Here the sets of finite measure in $\mathbb{Z}$ are the finite sets, and
thus the relevant convergence is pointwise convergence.

For (c), Theorem 4.14 shows that the weak-star topology on the closed unit ball of
$L^\infty(\mathbb{Z})$ is compact metric, and therefore the topology is specified by sequences. The
convexity of $K$ is routine, and we just have to see that $K$ is closed. We can do this
by assuming that we have a pointwise convergent sequence whose members are in $K$
and by proving that the limit is in $K$. This too is routine.

For (d), suppose that $e^{in\theta} = (1-t)F_1(n) + tF_2(n)$ nontrivially. Taking the absolute
value and using (a), we have $1 \le (1-t)|F_1(n)| + t|F_2(n)| \le (1-t) + t = 1$, and
equality must hold throughout. Therefore $|F_1(n)| = |F_2(n)| = 1$. Suppressing the
parameter $n$, suppose that we have $e^{i\varphi} = (1-t)e^{i\varphi_1} + te^{i\varphi_2}$ nontrivially. Multiplying
through by $e^{-i\varphi}$, we reduce to the case that $\varphi = 0$. So we have $1 = (1-t)e^{i\varphi_1} + te^{i\varphi_2}$.
The real part is $1 = (1-t)\cos\varphi_1 + t\cos\varphi_2$, and we must have $\cos\varphi_1 = \cos\varphi_2 = 1$
and $e^{i\varphi_1} = e^{i\varphi_2} = 1$. Hence $F_1(n) = e^{in\theta} = F_2(n)$, and $n \mapsto e^{in\theta}$ is an extreme
point.

For (e), the Fourier coefficient mapping from complex Borel measures on the circle 
to doubly infinite sequences is linear and one-one, and we are told to assume that the 
mapping carries the set of Borel measures onto the set of positive definite functions. 
The value of the positive definite function at 0 is then the total measure of the circle. 
Hence the question translates into identifying the extreme Borel measures of total 
mass | on the circle. Problem 17 shows that these are the point masses. 

21. For (a), the convergence is proved by showing that the partial sums form
a Cauchy sequence. For $m < n$, we have $\big\|\sum_{k=0}^n (f/C)^k - \sum_{k=0}^m (f/C)^k\big\|_{\sup} =
\big\|\sum_{k=m+1}^n (f/C)^k\big\|_{\sup} \le \sum_{k=m+1}^n \|f/C\|_{\sup}^k$, and the right side tends to 0 as $m$ and
$n$ tend to infinity because $\|f/C\|_{\sup} = |C|^{-1}\|f\|_{\sup} < 1$. So the series converges to
some $x$. Since $\big(\sum_{k=0}^n (f/C)^k\big)(1 - f/C) = 1 - (f/C)^{n+1}$ and since multiplication
is continuous, the element $x$ is a multiplicative inverse to $1 - f/C$.
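
The identity used at the end of (a) is the usual telescoping sum. A brief sketch follows, with $r = f/C$ as our shorthand (the symbol $r$ is not in the text).

```latex
% Telescoping identity behind 21(a), with r = f/C and ||r||_sup < 1.
\[
  \Bigl(\sum_{k=0}^{n} r^k\Bigr)(1 - r)
  = \sum_{k=0}^{n} r^k - \sum_{k=1}^{n+1} r^k
  = 1 - r^{\,n+1},
  \qquad
  \|r^{\,n+1}\|_{\sup} \le \|r\|_{\sup}^{\,n+1} \longrightarrow 0 .
\]
% Letting n tend to infinity and using continuity of multiplication gives x(1 - r) = 1.
```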

In (b), $\ell(f) = C$ would imply $\ell(1 - f/C) = \ell(1) - \ell(f)/C = 0$. But then
$0 = 0 \cdot \ell(x) = \ell(1 - f/C)\ell(x) = \ell(1) = 1$ would give a contradiction.

From (b) we obtain $|\ell(f)| \le 1$. Taking the supremum over all $f$ with $\|f\|_{\sup} \le 1$,
we find that $\|\ell\| \le 1$. Thus $\ell$ is bounded. This proves (c).

22. Problem 21 shows that £ is bounded. The result follows by using the Stone 
Representation Theorem and the first example after its proof. 

23. If $t$ is in $T$, define $\ell_t(f) = (Uf)(t)$ for $f$ in $C(S)$. It is routine to check that
$\ell_t$ satisfies the hypotheses of Problem 22 and is therefore given by evaluation at some
$s$ in $S$. Define this $s$ to be $u(t)$. The proofs of (a), (b), and (c) are then straightforward.


24. This is just a matter of applying Problem 23 and tracking down the isomor- 
phisms. 


25. Let $S$ be a nonempty set, and let $A$ be a uniformly closed subalgebra of $B(S)$
with the properties that $A$ is stable under complex conjugation and contains 1. If $S_2$ is
a compact Hausdorff space and $V : A \to C(S_2)$ is an algebra isomorphism mapping 1
to 1 and respecting conjugation and if $S_1$, $p$, and $U$ are as in Theorem 4.15, then there
exists a unique homeomorphism $\Phi : S_2 \to S_1$ such that $(Uf)(\Phi(s_2)) = (Vf)(s_2)$
for all $f$ in $A$. Then one has to give a proof.


26. For (a), the reflexive and symmetric properties are immediate from the
definition. For the transitive property let $x_i \sim x_j$ and $x_j \sim x_l$. Say that $i \le k$,
$j \le k$, $\psi_{ki}(x_i) = \psi_{kj}(x_j)$, $j \le m$, $l \le m$, $\psi_{mj}(x_j) = \psi_{ml}(x_l)$. Choose $n$ with
$k \le n$ and $m \le n$. Application of $\psi_{nk}$ to $\psi_{ki}(x_i) = \psi_{kj}(x_j)$ gives $\psi_{ni}(x_i) = \psi_{nj}(x_j)$,
and application of $\psi_{nm}$ to $\psi_{mj}(x_j) = \psi_{ml}(x_l)$ gives $\psi_{nj}(x_j) = \psi_{nl}(x_l)$. Therefore
$\psi_{ni}(x_i) = \psi_{nl}(x_l)$, and $\sim$ is transitive.

For (b), suppose that $\psi_{ki}(x_i) = \psi_{kj}(x_j)$. We are to show that $\psi_{li}(x_i) = \psi_{lj}(x_j)$
whenever $i \le l$ and $j \le l$. Assume the contrary for some $l$. Choose $m$ with $k \le m$
and $l \le m$. Application of $\psi_{mk}$ to $\psi_{ki}(x_i) = \psi_{kj}(x_j)$ gives $\psi_{mi}(x_i) = \psi_{mj}(x_j)$. On
the other hand, application of $\psi_{ml}$ to $\psi_{li}(x_i) \ne \psi_{lj}(x_j)$ gives $\psi_{mi}(x_i) \ne \psi_{mj}(x_j)$
since $\psi_{ml}$ is by assumption one-one. Thus we have a contradiction.

27. Suppose that we are given maps $\varphi_i : W_i \to Z$ with $\varphi_j \circ \psi_{ji} = \varphi_i$ whenever
$i \le j$. Define $\widetilde{\Phi} : \bigsqcup_i W_i \to Z$ by $\widetilde{\Phi}(x_j) = \varphi_j(x_j)$ if $x_j$ is in $W_j$. The map $\widetilde{\Phi}$
is continuous, and the claim is that it descends to the quotient to give a map $\Phi$
satisfying $\Phi(q(x_j)) = \widetilde{\Phi}(x_j)$. To see the necessary consistency, suppose $x_j \sim x_l$
with $x_l$ in $W_l$. Say that $j \le k$, $l \le k$, and $\psi_{kj}(x_j) = \psi_{kl}(x_l)$. Then we have
$\widetilde{\Phi}(x_j) = \varphi_j(x_j) = \varphi_k\psi_{kj}(x_j) = \varphi_k\psi_{kl}(x_l) = \varphi_l(x_l) = \widetilde{\Phi}(x_l)$, and the consistency is
proved. The definition of $\Phi$ is complete, and we have arranged that $\Phi \circ (q|_{W_j}) = \varphi_j$
for each $j$. This establishes existence of the map $\Phi$ in the universal mapping property.
Since $q$ carries $\bigsqcup_i W_i$ onto $W$, the formulas $\Phi \circ (q|_{W_j}) = \varphi_j$ force the definition we
have used for $\Phi$. This establishes the uniqueness of the map $\Phi$ in the universal
mapping property.

28. With $(V, \{p_i\})$ as a direct limit, take $Z = W$ and $\varphi_i = q_i$. Each map $\varphi_i$
carries $W_i$ into $Z$, and the universal mapping property of $(V, \{p_i\})$ yields a mapping
$F : V \to W$ with $q_i = F \circ p_i$ for all $i$. Reversing the roles of $(V, \{p_i\})$ and $(W, \{q_i\})$,
we obtain a mapping $G : W \to V$ with $p_i = G \circ q_i$ for all $i$.

With $(V, \{p_i\})$ as a direct limit, take $Z = V$ and $\varphi_i = p_i$. Then the identity $1_V$
meets the condition of the universal mapping property for this situation. On the other
hand, so does $G \circ F$, which carries $V$ to itself and has $p_i = G \circ q_i = (G \circ F) \circ p_i$. By
the uniqueness that is part of the universal mapping property, $G \circ F = 1_V$. Similarly
$F \circ G = 1_W$. Thus $F$ is a homeomorphism.

The homeomorphism $F$ is unique because any such mapping $F^\#$ must similarly
have $G \circ F^\# = 1_V$ and $F^\# \circ G = 1_W$. Thus $F^\#$ must be a two-sided inverse to $G$,
and there can be only one such function.


29. For (a), let $U$ be an open set in $\bigsqcup_i W_i$. We are to prove that $q(U)$ is open.
Since each $W_i$ is open in the disjoint union, we may assume that $U \subseteq W_i$ for some $i$.
We are to prove that $q^{-1}(q(U))$ is open, hence that $q^{-1}(q(U)) \cap W_j$ is open for each
$j$. Thus we are to show that the set $V$ of all $x_j$ in $W_j$ such that $x_j \sim x_i$ for some $x_i$ in
$U$ is open in $W_j$. Choose $k$ with $i \le k$ and $j \le k$. Then we have $V = \psi_{kj}^{-1}(\psi_{ki}(U))$.
The hypothesis for this problem makes $\psi_{ki}(U)$ open in $W_k$, and then $\psi_{kj}^{-1}(\psi_{ki}(U))$ is
open since $\psi_{kj}$ is continuous.

For (b), we are to separate $q(x_i)$ and $q(x_j)$ by disjoint open sets if $x_i$ and $x_j$ are not
equivalent. Choose $k$ with $i \le k$ and $j \le k$, so that $\psi_{ki}(x_i)$ and $\psi_{kj}(x_j)$ are both in
$W_k$. They are distinct in $W_k$ by Problem 26b. Since $W_k$ is Hausdorff, we can choose
disjoint open sets $A$ and $B$ in $W_k$ with $\psi_{ki}(x_i)$ in $A$ and $\psi_{kj}(x_j)$ in $B$. Then $q(A)$ and
$q(B)$ are disjoint since $q$ is one-one on $W_k$, and they are open by (a).

For (c), the mapping into the direct limit is continuous and open and therefore 
carries compact neighborhoods to compact neighborhoods. Since the quotient map 
is onto the direct limit, every point of the direct limit has a compact neighborhood. 

For an example in (d), take $W_i = \{1, \ldots, i\}$ for each $i$, with $\psi_{ji}$ equal to the
inclusion if $i \le j$. Each $W_i$ is finite, hence compact, and the direct limit is the set of
positive integers with the discrete topology.

30. Each $X(S)$ is Hausdorff as the product of Hausdorff spaces. The space
$\big(\prod_{i\notin S} K_i\big)$ is compact by the Tychonoff Product Theorem, and then $X(S)$ is the
product of finitely many locally compact spaces, which is locally compact. The
Hausdorff property is handled by Problem 29b, and the final assertion is clear from
the definition.


Chapter V


1. If $K$ is compact in $U$, then $K$ is compact in $V$, and hence the inclusion of
$C^\infty_K$ into $C^\infty_{\mathrm{com}}(V)$ is continuous. By Proposition 4.29 the inclusion of $C^\infty_{\mathrm{com}}(U)$ into
$C^\infty_{\mathrm{com}}(V)$ is continuous.

2. Fix $K$ compact large enough to contain support($\varphi$). Then the map $\psi \mapsto \psi\varphi$ is
continuous from $C^\infty(U)$ into $C^\infty_K$. The inclusion of $C^\infty_K$ into $C^\infty_{\mathrm{com}}(U)$ is continuous,
and hence $\psi \mapsto \psi\varphi$, being a composition of continuous functions, is continuous
from $C^\infty(U)$ into $C^\infty_{\mathrm{com}}(U)$.

3. Let {K;} be an exhausting sequence of compact subsets of U, and choose 
wy € CS (U) that is 1 on K; and is 0 off K;,1. For each j, the product (|, —ovy; 
is in C&,(U) with support contained in the open set U M (support(Ty))°. Therefore 
Tu (|, — 9\) Wj) = 0 for each j. The functions Cn — 91); tend to \u — gin 
the topology of C°(U), and therefore Ty gy —g ) = 0. Hence Ty (¢|y) = Ty (1) 


as required.

4. An adjustment is needed to the proof of Theorem 5.1. The proof in the text
in effect used the expressions $\|f\|_{K',\alpha} = \sup_{x\in K'} |(D^\alpha f)(x)|$ as seminorms together
describing the relative topology of $C^\infty_K$ as a subspace of $C^\infty(\mathbb{R}^N)$. To modify the proof
of the theorem, we need to see that the same relative topology results from using the
expressions $\|f\|_{K',\alpha,\mathrm{new}} = \|D^\alpha f\|_{L^2(K')}$. In one direction we have $\|D^\alpha f\|_{L^2(K')} \le
C\sup_{x\in K'} |(D^\alpha f)(x)|$, the constant $C$ being the $L^2$ norm of the function 1 on $K'$. In
the reverse direction we apply Sobolev's inequality (Theorem 3.11) with $U$ equal to
the interior of $K'$. This open set satisfies the cone condition. Sobolev's inequality
shows for $k > N/2$ that $\sup_{x\in K'} |(D^\alpha f)(x)| \le C\big(\sum_{|\beta|\le k}\int_{K'} |D^{\alpha+\beta}f(x)|^2\,dx\big)^{1/2}$. We
follow the lines of the proof of Theorem 5.1, using these new seminorms and using
linear functionals on spaces of $L^2$ functions instead of spaces of continuous functions,
and the desired result follows.


5. For (a), we write (T,g) = °, Ves D* g dpa(x) by means of Theorem 5.1. 
Substitution and use of Lemma 5.6 gives 


(T, F) = Ya aw DE Sx PO, y) dU(y) dpa (x) 
= Ya Sev fe DEP, y) duly) dpa (x). 


On the other hand, fe, ®(-, y))duQy) = Te yes ton DE ®(x, y) dpu(x) du(y), 
and the two expressions are equal by Fubini’s Theorem. 

For (b), choose a compact subset L of R such that L x K contains support(®), 
and choose n in C2. (IR™) that is identically 1 on L. Part (a) shows that 


com 


(nS, F) = f,(nS, ®(-, y)) du). 


On the other hand, we have (nS, F) = (S,nF) = (S, F), and (nS, ®(-, y)) = 
(S,n(-)®(-, y)) = (S, ®(-, y)), and the result follows. 

6. Fix a member $\eta$ of $C^\infty_{\mathrm{com}}(U)$ with values in $[0, 1]$, so that $\eta T$ is a member of
$\mathcal{E}'(U)$. If $\varphi$ is a real-valued member of $C^\infty_{\mathrm{com}}(U)$, then for both choices of the sign $\pm$,
$\eta(\|\varphi\|_{\sup} \pm \varphi)$ is a member of $C^\infty_{\mathrm{com}}(U)$ that is $\ge 0$. Hence $(T, \eta(\|\varphi\|_{\sup} \pm \varphi)) \ge 0$,
and $(T, \eta)\|\varphi\|_{\sup} = (T, \eta\|\varphi\|_{\sup}) \ge \mp(T, \eta\varphi) = \mp(\eta T, \varphi)$, i.e., $|(\eta T, \varphi)| \le
(T, \eta)\|\varphi\|_{\sup}$. For complex-valued $\varphi$, such an estimate is valid for the real and
imaginary parts separately, and we conclude that $\varphi \mapsto (\eta T, \varphi)$ is a bounded linear
functional on $C^\infty_{\mathrm{com}}(U)$ relative to the supremum norm. The Stone-Weierstrass
Theorem shows that $C^\infty_{\mathrm{com}}(U)$ is uniformly dense in the space $C_0(U)$ of continuous
functions vanishing at "infinity" relative to $U$. In particular, $C^\infty_{\mathrm{com}}(U)$ is uniformly
dense in $C_{\mathrm{com}}(U)$, and $\varphi \mapsto (\eta T, \varphi)$ extends to a continuous linear functional on
$C_{\mathrm{com}}(U)$ relative to the supremum norm. Using the continuity of this linear functional
and the denseness of $C^\infty_{\mathrm{com}}(U)$, we check that the extension of the linear functional to
$C_{\mathrm{com}}(U)$ is a positive linear functional. By the Riesz Representation Theorem it is
given by a Borel measure $\mu_\eta$. The boundedness of the linear functional forces $\mu_\eta(U)$
to be finite.

Let $\{K_p\}$ be an exhausting sequence. Define $\eta_p$ to be a member of $C^\infty_{\mathrm{com}}(U)$ with


values in [0, 1] that is 1 on K2, and is O off KS 4D and form the corresponding 
Borel measures f4,. Then the sequence {7,(x)} is nondecreasing for each x and 
has limit 1. The measures jz, have to be nondecreasing on each set, and we define 
LL(E) = lim, 2p(£) for each Borel set E. The nondecreasing limit of measures is a 
measure, with the complete additivity holding by monotone convergence. We show 
that (7, ¢) = fy, gd for every gy in C&,,(U). 

For any g in C&,,(U), as soon as po is large enough so that K2,, contains 
support(y), we have (n,T, y) = (T, ~) for p > po. Also, “,(E) remains constant 
for each Borel subset of K2, when p > po, and hence w(E) = w,(F) for such 


subsets. Thus (7, ¢) = (npT, 9) = fy Gdep = fy Y du, as asserted. 

7. Computation gives $\Delta(e^{-\pi|x|^2}) = 4\pi^2|x|^2e^{-\pi|x|^2} - 2\pi N e^{-\pi|x|^2}$. What needs
computing is $\int_{\mathbb{R}^N} |x|^{-(N-2)}|x|^{2p}e^{-\pi|x|^2}\,dx$ for $p = 1$ and $p = 0$, and then one has
to sort out the result. This integral equals $\Omega_{N-1}\int_0^\infty r^{2p+1}e^{-\pi r^2}\,dr$. For $p = 1$ and
$p = 0$, the integral is elementary. Alternatively, it can be converted into a value of
the gamma function by the change of variables $\pi r^2 \mapsto s$. In neither case does the
value of $\Omega_{N-1}$ need to be computed.
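
The "sorting out" in this hint can be checked numerically. The sketch below is only an illustration, not part of the original hint: the function names are made up for this example, and the closed form $\Gamma(p+1)/(2\pi^{p+1})$ is what the substitution $s = \pi r^2$ gives for the radial integral. Under that assumption the two terms combine to $(2 - N)\,\Omega_{N-1}$.

```python
import math

def radial_integral(p: int) -> float:
    """Closed form of the integral of r^(2p+1) * exp(-pi r^2) over (0, infinity).

    The substitution s = pi r^2 turns it into Gamma(p+1) / (2 pi^(p+1)).
    """
    return math.gamma(p + 1) / (2 * math.pi ** (p + 1))

def radial_integral_numeric(p: int, upper: float = 10.0, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the same integral, truncated at `upper`."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (2 * p + 1) * math.exp(-math.pi * r * r)
    return total * h

def combined_coefficient(N: int) -> float:
    """Coefficient of Omega_{N-1} after integrating |x|^{-(N-2)} times the Laplacian.

    Uses Laplacian(exp(-pi|x|^2)) = (4 pi^2 |x|^2 - 2 pi N) exp(-pi|x|^2),
    so the p = 1 integral is weighted by 4 pi^2 and the p = 0 integral by -2 pi N.
    """
    return 4 * math.pi ** 2 * radial_integral(1) - 2 * math.pi * N * radial_integral(0)

if __name__ == "__main__":
    for p in (0, 1):
        print(p, radial_integral(p), radial_integral_numeric(p))
    print(combined_coefficient(3))
```

The numeric quadrature is there only as a cross-check on the gamma-function value; the closed form alone already settles the computation.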

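8. Part (a) follows from the chain rule and the boundedness of each derivative of
$\eta$ since $(\eta_\varepsilon)^{(k)}(x) = \varepsilon^{-k}\eta^{(k)}(\varepsilon^{-1}x)$.

For (b), if $\varphi$ has compact support, then $(1-\eta_\varepsilon)\varphi$ has compact support away from
$\{0\}$. Therefore $\langle T,(1-\eta_\varepsilon)\varphi\rangle = 0$, and $\langle T,\varphi\rangle = \langle T,(1-\eta_\varepsilon)\varphi\rangle + \langle T,\eta_\varepsilon\varphi\rangle =
\langle T,\eta_\varepsilon\varphi\rangle$. Since $\varphi \mapsto \langle T,\varphi\rangle$ and $\varphi \mapsto \langle T,\eta_\varepsilon\varphi\rangle$ are continuous and $C^\infty_{\mathrm{com}}(\mathbb{R}^1)$ is
dense in $C^\infty(\mathbb{R}^1)$, $\langle T,\varphi\rangle = \langle T,\eta_\varepsilon\varphi\rangle$ for all $\varphi$ in $C^\infty(\mathbb{R}^1)$.

In (c), we apply (a) and obtain

$$\begin{aligned}
|\langle T,\eta_\varepsilon\varphi\rangle| &\le C\sum_{k=0}^n \sup_{x\in\mathbb{R}^1} |D^k(\eta_\varepsilon\varphi)(x)| = C\sum_{k=0}^n \sup_{|x|\le\varepsilon} |D^k(\eta_\varepsilon\varphi)(x)| \\
&\le C'\sum_{k=0}^n\sum_{l=0}^k \sup_{|x|\le\varepsilon} |D^{k-l}(\eta_\varepsilon)(x)\,(D^l\varphi)(x)| \\
&\le C''\sum_{k=0}^n\sum_{l=0}^k \varepsilon^{l-k}\sup_{|x|\le\varepsilon} |(D^l\varphi)(x)| \\
&\le C'''\sum_{l=0}^n \varepsilon^{l-n}\sup_{|x|\le\varepsilon} |(D^l\varphi)(x)|.
\end{aligned}$$

When $\varphi(x) = \psi(x)x^{n+1}$, $|D^l\varphi(x)| \le c\sum_{j=0}^l |D^{l-j}\psi(x)|\,|x^{n+1-j}|$, and the supremum
for $|x| \le \varepsilon$ is $\le c'\varepsilon^{n+1-l}$. Therefore

$$|\langle T,\varphi\rangle| = |\langle T,\eta_\varepsilon\varphi\rangle| \le c'C'''\sum_{l=0}^n \varepsilon^{l-n}\varepsilon^{n+1-l} = c'C'''(n+1)\varepsilon.$$

The right side tends to 0 as $\varepsilon$ decreases to 0, and thus $\langle T,\varphi\rangle = 0$.

In (d), the Taylor expansion of a general $\varphi$ is $\varphi(x) = \sum_{k=0}^n \frac{1}{k!}\varphi^{(k)}(0)x^k + \psi(x)x^{n+1}$
with $\psi$ in $C^\infty(\mathbb{R}^1)$. Applying $\langle T,\cdot\,\rangle$ to both sides and using (c), we obtain $\langle T,\varphi\rangle =
\sum_{k=0}^n \frac{1}{k!}\varphi^{(k)}(0)\langle T,x^k\rangle$.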

9. The adjustments in the argument are to (a) and (c). For (a), the estimate is
$|D^\alpha\eta_\varepsilon(x)| \le C_\alpha\varepsilon^{-|\alpha|}$ and is again proved by the chain rule. Each differentiation
introduces a factor of $\varepsilon^{-1}$. For (c), Taylor's Theorem says that the remainder term in
computing a smooth function $\varphi(x)$ about the point 0 is

$$\sum_{\substack{l_1+\cdots+l_N=n+1,\\ \text{all } l_j\ge 0}} \frac{n+1}{l_1!\cdots l_N!}\,x_1^{l_1}\cdots x_N^{l_N}\int_0^1 (1-s)^n\,\frac{\partial^{n+1}\varphi}{\partial x_1^{l_1}\cdots\partial x_N^{l_N}}(sx)\,ds,$$

hence is of the form

$$\sum_{\substack{l_1+\cdots+l_N=n+1,\\ \text{all } l_j\ge 0}} \psi_{l_1,\dots,l_N}(x)\,x_1^{l_1}\cdots x_N^{l_N}.$$

Thus one works with a function $\psi(x)x_1^{l_1}\cdots x_N^{l_N}$ with $\psi$ smooth and with $\sum_j l_j = n+1$.
The argument for (c) in Problem 8 now can be used.


10. As with Problem 9, the arguments for (a) and (c) in Problem 8 need adjustment,
and this time we need to change (d) completely. For (a), we use the above function
$\eta$ for $\mathbb{R}^{N-L}$, and we define $\eta_\varepsilon(x',x'') = \eta(\varepsilon^{-1}x'')$. Then (a) causes no difficulties.
For (c), we need a new form of Taylor's Theorem. The point is to treat $\varphi(x',x'')$ as
a function of $x''$ alone, form a Taylor expansion with remainder, and carry along $x'$
as a parameter. The result is that the remainder term is a sum of terms of the form
$\psi(x',x'')M(x'')$, where $\psi$ is in $C^\infty(\mathbb{R}^N)$ and $M$ is a homogeneous monomial in the
$x''$ variables of total degree $n+1$. Then (c) causes no difficulties and again gives 0. In
(d), the main terms of the Taylor expansion are of the form $c_\alpha D^\alpha\varphi(x',0)(x'')^\alpha$, where
$\alpha$ is a multi-index that is nonzero only in the positions corresponding to $x''$ and has
total degree $\le n$. We introduce a linear functional $T_\alpha$ on $C^\infty(\mathbb{R}^L)$ by the definition
$\langle T_\alpha,\psi(x')\rangle = c_\alpha\langle T,\psi(x')(x'')^\alpha\rangle$. Then $T_\alpha$ is continuous, and the expansion $\langle T,\varphi\rangle =
\sum_{|\alpha|\le n}\langle T_\alpha,(D^\alpha\varphi)|_{\mathbb{R}^L}\rangle$ has the required form.

11. Subtracting two tempered distributions solving $\Delta u = f$, we obtain a tempered
distribution $u$ with $\Delta u = 0$. From $\mathcal{F}(D^\alpha u) = (2\pi i)^{|\alpha|}\xi^\alpha\mathcal{F}(u)$ and $\mathcal{F}(\Delta u) = 0$, we
obtain $|\xi|^2\mathcal{F}(u) = 0$. It follows that $\mathcal{F}(u)$ is supported at $\{0\}$. Problem 9 then shows
that $\mathcal{F}(u)$ is a finite sum of the form $\sum_\alpha c_\alpha D^\alpha\delta$. Taking the inverse Fourier transform
of both sides, we see that the distribution $u$ equals a polynomial function.


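12. Apply Theorem 5.1 to a member $S$ of $\mathcal{E}'((-2\pi,2\pi)^N)$, writing it as a sum
of finitely many derivatives of complex Borel measures $\rho_\alpha$ of compact support:
$\langle S,\varphi\rangle = \sum_{|\alpha|\le m}\int_K D^\alpha\varphi\,d\rho_\alpha$, where $K$ is a compact subcube of $(-2\pi,2\pi)^N$. For
$\varphi(x) = e^{-ik\cdot x}$, we have $\sup_{x\in K}|D^\alpha(e^{-ik\cdot x})| \le |k^\alpha|$, and therefore $|\langle S,e^{-ik\cdot x}\rangle| \le
\sum_{|\alpha|\le m}|k^\alpha|\,\|\rho_\alpha\| \le C(1+|k|^2)^{m/2}$, where $C = \sum_{|\alpha|\le m}\|\rho_\alpha\|$.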

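13. Change notation and suppose that $|c_k| \le C(1+|k|^2)^m$ for all $k$. The series
$f(x) = \sum_k \dfrac{c_k e^{ik\cdot x}}{(1+|k|^2)^{m+N+1}}$ is then absolutely uniformly convergent, and $f(x)$ is
continuous periodic. Define $S' \in \mathcal{E}'((-2\pi,2\pi)^N)$ by

$$\langle S',\varphi\rangle = (2\pi)^{-N}\int_{[-\pi,\pi]^N} f(x)\varphi(x)\,dx.$$

Let $D = 1 - \Delta$, and define $S = D^{m+N+1}S'$. Then

$$\begin{aligned}
\langle S,e^{-ik\cdot x}\rangle &= \langle S',D^{m+N+1}e^{-ik\cdot x}\rangle = (1+|k|^2)^{m+N+1}\langle S',e^{-ik\cdot x}\rangle \\
&= (1+|k|^2)^{m+N+1}(2\pi)^{-N}\int_{[-\pi,\pi]^N} f(x)e^{-ik\cdot x}\,dx \\
&= (1+|k|^2)^{m+N+1}\,\frac{c_k}{(1+|k|^2)^{m+N+1}} = c_k,
\end{aligned}$$

as required.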
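14. For each $\varphi$, the set of $\psi$ with $|B(\varphi,\psi)| \le M\|\varphi\|_{L^2_k(T^N)}$ is the set where the
continuous function $|B(\varphi,\cdot\,)|$ is $\le$ some constant, and this is closed. The set $E_{k,M}$ is
the intersection of such sets and is therefore closed. For each $\psi$, the function $B(\,\cdot\,,\psi)$
is linear and continuous, and therefore there exist an integer $k$ and a constant $M$ for
which $|B(\varphi,\psi)| \le M\|\varphi\|_{L^2_k(T^N)}$ for all $\varphi$. This proves (a).

Since $C^\infty(T^N)$ is complete, the Baire Category Theorem shows that some $E_{k,M}$
has nonempty interior, hence contains a basic open set, i.e., some set of the form
$U = \{\psi' \mid \|\psi'-\psi_0\|_{L^2_s(T^N)} < \varepsilon\}$. If $\psi$ has $\|\psi\|_{L^2_s(T^N)} < \varepsilon$, then $\psi_0+\psi$ is in $U$ and
thus has $|B(\varphi,\psi_0+\psi)| \le M\|\varphi\|_{L^2_k(T^N)}$ for all $\varphi$ in $C^\infty(T^N)$. Then

$$|B(\varphi,\psi)| \le |B(\varphi,\psi_0+\psi)| + |B(\varphi,\psi_0)| \le M\|\varphi\|_{L^2_k(T^N)} + C_{\psi_0,k(\psi_0)}\|\varphi\|_{L^2_{k(\psi_0)}(T^N)}.$$

The right side is $\le M'\|\varphi\|_{L^2_{k_1}(T^N)}$ for $k_1 = \max\{k,k(\psi_0)\}$ and $M' = M + C_{\psi_0,k(\psi_0)}$.
Hence $|B(\varphi,\psi)| \le M'\varepsilon^{-1}\|\varphi\|_{L^2_{k_1}(T^N)}\|\psi\|_{L^2_s(T^N)} \le M'\varepsilon^{-1}\|\varphi\|_{L^2_{k_2}(T^N)}\|\psi\|_{L^2_{k_2}(T^N)}$,
where $k_2 = \max\{k_1,s\}$.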

15. We apply the inequality of Problem 14b to $D^\alpha\varphi$ and $D^\beta\psi$, and then the result
follows by applying the norm inequality of Problem 27 in Chapter III to $\|D^\alpha\varphi\|_{L^2_k(T^N)}$
and $\|D^\beta\psi\|_{L^2_k(T^N)}$.

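16. The functions $e^{il\cdot x}e^{im\cdot y}$ are orthonormal in $L^2(T^N\times T^N)$, and it is therefore
enough to show that the sum of the absolute-value squared of the coefficients is finite.
That is, we are to show that

$$\sum_{l,m\in\mathbb{Z}^N} \frac{|b_{lm}|^2\,l^{2\alpha}m^{2\beta}}{\bigl(\sum_{|\alpha'|\le k'} l^{2\alpha'}\bigr)^2\bigl(\sum_{|\beta'|\le k'} m^{2\beta'}\bigr)^2} < \infty$$

whenever $|\alpha| \le k'$ and $|\beta| \le k'$. Since $l^{2\alpha} \le \sum_{|\alpha'|\le k'} l^{2\alpha'}$ and $m^{2\beta} \le \sum_{|\beta'|\le k'} m^{2\beta'}$,
it is enough to prove that

$$\sum_{l,m\in\mathbb{Z}^N} \frac{|b_{lm}|^2}{\bigl(\sum_{|\alpha'|\le k'} l^{2\alpha'}\bigr)\bigl(\sum_{|\beta'|\le k'} m^{2\beta'}\bigr)} < \infty. \qquad (*)$$

If we use the estimate of Problem 15 for $\varphi = e^{il\cdot(\cdot)}$ and $\psi = e^{im\cdot(\cdot)}$, we have

$$l^{2\alpha}m^{2\beta}|b_{lm}|^2 = |B(D^\alpha e^{il\cdot(\cdot)},D^\beta e^{im\cdot(\cdot)})|^2 \le C\,\|e^{il\cdot(\cdot)}\|^2_{L^2_{k'}(T^N)}\|e^{im\cdot(\cdot)}\|^2_{L^2_{k'}(T^N)}$$

for $|\alpha| \le K$ and $|\beta| \le K$. Hence

$$l^{2\alpha}m^{2\beta}|b_{lm}|^2 \le C\Bigl(\sum_{|\alpha'|\le k'} l^{2\alpha'}\Bigr)\Bigl(\sum_{|\beta'|\le k'} m^{2\beta'}\Bigr).$$

Summing over $\alpha$ and $\beta$ for $|\alpha| \le K$ and $|\beta| \le K$ and taking into account Problem 29
in Chapter III, we obtain

$$(1+|l|^2)^K(1+|m|^2)^K|b_{lm}|^2 \le C'\Bigl(\sum_{|\alpha'|\le k'} l^{2\alpha'}\Bigr)\Bigl(\sum_{|\beta'|\le k'} m^{2\beta'}\Bigr)$$

for a constant $C'$. Thus the left side of (*) is $\le C'\sum_{l,m\in\mathbb{Z}^N}(1+|l|^2)^{-K}(1+|m|^2)^{-K}$,
and Problem 32 of Chapter III shows that this is finite.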


17. Since $F_{\alpha,\beta}$ is in $L^2(T^N\times T^N)$, $B'$ is a continuous function of two $L^2(T^N)$
variables $D^\alpha\varphi$ and $D^\beta\psi$. In particular it is well defined for $\varphi$ and $\psi$ in $C^\infty(T^N)$.
Because of continuity in $L^2$ and orthogonality, we have


ony f Fup x, y)D%el!* DB eit dx dy 
[2,0 ]2" 


Dy (—i) 141812 mB i148] ¢ mB 


= (20)-?% i dxd 
(27) cma mF) xdy 
Ja’|<k’ |B’|sk’ 
_ Diy l2%m2P 
(5 2) a) 
Jo’|<k’ |B \sk’ 


Summing for $\alpha$ and $\beta$ with $|\alpha| \le k'$ and $|\beta| \le k'$, we obtain $B'(e^{il\cdot(\cdot)},e^{im\cdot(\cdot)}) =
B(e^{il\cdot(\cdot)},e^{im\cdot(\cdot)})$.

18. The result of Problem 17 implies that $B'(\varphi,\psi) = B(\varphi,\psi)$ if $\varphi$ and $\psi$ are
trigonometric polynomials. It shows also that convergence in $L^2$ of either variable
and its derivatives through order $k'$ implies convergence of $B'$. Since convergence
in $C^\infty(T^N)$ implies convergence in $L^2_{k'}(T^N)$ and since $B$ is separately continuous,
$B' = B$ on $C^\infty(T^N)$. The expression on the right side of the display in the statement
of Problem 17 is the action of a distribution on $T^N\times T^N$ upon the function $\varphi\otimes\psi$,
and thus $B(\varphi,\psi) = \langle S,\varphi\otimes\psi\rangle$ for a suitable distribution $S$.


19. By the Schwarz inequality, |B(f, g)| < ]H(@P)lloiinglls = Inf llallng lle 
IF llolslle < WF llsupll$ llsup- This proves (a). 

For (b), we argue by contradiction. Using continuous functions f and g with 
disjoint supports, we see near (0, 0) that we must have do(x, y) = + an . However, 
the function = is not locally integrable, and there can be no such signed measure p. 
1. For (a), let $C$ be the connected component of 1. Since multiplication is
continuous, it carries the connected set $C\times C$ to a connected set containing 1, hence to
a subset of $C$. Thus $C$ is closed under products. Similarly it is closed under inverses.
It is topologically closed since the closure of a connected set is connected. If $g$ is in
$G$, then the map $x \mapsto gxg^{-1}$ is continuous and therefore carries the connected set $C$
to a connected set containing 1, hence to a subset of $C$. Thus $gCg^{-1} \subseteq C$ for all $g$,
and $C$ is normal. For (b), one can take the additive rationals or the countable product
of two-element groups; for each the identity component contains only the identity
element.


2. In (a), if $g$ fixes the first standard basis vector, then the first column of $g$ is the first
standard basis vector. Since $g$ is a rotation, $g^{\mathrm{tr}}g = 1$. In particular $\sum_j (g^{\mathrm{tr}})_{ij}g_{j1} = \delta_{i1}$.
Thus $(g^{\mathrm{tr}})_{i1} = \delta_{i1}$ for all $i$, and $g_{1i} = \delta_{i1}$. In other words, the first row of $g$ is 0
except in the first entry.

In (b), let $v$ be any unit vector in $\mathbb{R}^N$, and extend $v$ to a basis $v, v_2, \dots, v_N$. The
Gram-Schmidt orthogonalization process replaces this basis by an orthonormal basis
such that the first member is still $v$. We form a matrix with this orthonormal basis
as its columns. If it has determinant $-1$, we multiply the last column by $-1$, and
then the determinant will be 1. The resulting matrix is in $SO(N)$ and carries the first
standard basis vector to $v$.
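
The construction in (b) can be sketched in code. This is an illustrative sketch only, assuming $N \ge 2$ and a unit vector given as a list of floats; the function names and the particular way of extending $v$ to a basis (adjoining all standard basis vectors except the one at the position of the largest entry of $v$) are choices made for this example, not taken from the text.

```python
import math

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors in order (classical Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            coeff = sum(a * b for a, b in zip(w, u))
            w = [a - coeff * b for a, b in zip(w, u)]
        norm = math.sqrt(sum(a * a for a in w))
        basis.append([a / norm for a in w])
    return basis

def determinant(rows):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def rotation_carrying_e1_to(v):
    """Return a matrix in SO(N) (list of rows, N >= 2) whose first column is the unit vector v."""
    n = len(v)
    # Extend v to a basis: v has a nonzero entry at position j, so v together
    # with the standard basis vectors e_i, i != j, is linearly independent.
    j = max(range(n), key=lambda i: abs(v[i]))
    extension = [[1.0 if r == i else 0.0 for r in range(n)] for i in range(n) if i != j]
    columns = gram_schmidt([list(v)] + extension)
    rows = [[columns[c][r] for c in range(n)] for r in range(n)]
    if determinant(rows) < 0:
        # Flip the last column so that the determinant becomes +1.
        for r in range(n):
            rows[r][n - 1] = -rows[r][n - 1]
    return rows

if __name__ == "__main__":
    for row in rotation_carrying_e1_to([0.6, 0.0, 0.8]):
        print(row)
```

Because the first Gram-Schmidt step only normalizes $v$, which is already a unit vector, the first column of the result is $v$ itself, and flipping the last column does not disturb it when $N \ge 2$.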

For (c), we obtain a continuous function $SO(N) \to S^{N-1}$ given by $g \mapsto ge_1$,
where $e_1$ is the first standard basis vector. This function descends to a function
$SO(N)/SO(N-1) \to S^{N-1}$ that is necessarily continuous. It is one-one onto, its
domain is compact, and the image is Hausdorff. Hence it is a homeomorphism.


3. What needs to be shown is that every sufficiently small open neighborhood
$N$ of $1\cdot H$ in $G/H$ is mapped to an open set by $\pi$. Since $G/H$ is locally compact
and has a countable base, there exist open neighborhoods $U_k$ of $1\cdot H$ such that $U_k^{\mathrm{cl}}$
is compact, $U_k^{\mathrm{cl}} \subseteq U_{k+1}$, and $G/H = \bigcup_k U_k$. The Baire Category Theorem for $X$
shows that $\pi(U_n^{\mathrm{cl}})$ has nonempty interior $V$ for some $n$. Let $y$ be a member of $G$ such
that $\pi(yH)$ is in $V$, and put $U = y^{-1}\pi^{-1}(V)$. Then $U$ is an open neighborhood of
$1\cdot H$ in $G/H$ and $\pi(U) = y^{-1}V$ is open in $X$. Also, $U^{\mathrm{cl}}$ is compact as a closed
subset of $y^{-1}U_n^{\mathrm{cl}}$. Let $N$ be any open neighborhood of $1\cdot H$ in $G/H$ that is contained in
$U$. Since $U^{\mathrm{cl}}$ is compact, $\pi$ is a homeomorphism from $U^{\mathrm{cl}}$ with the relative topology
to $\pi(U^{\mathrm{cl}})$ with the relative topology. Thus $\pi(N)$ is relatively open in $\pi(U^{\mathrm{cl}})$. Hence
$\pi(N) = \pi(U^{\mathrm{cl}}) \cap W$ for some open set $W$ in $X$. Since $\pi(N) \subseteq \pi(U)$, we can
intersect both sides with $\pi(U)$ and get $\pi(N) = \pi(U^{\mathrm{cl}}) \cap W \cap \pi(U) = W \cap \pi(U)$.
Since $W \cap \pi(U)$ is open in $X$, $\pi(N)$ is open in $X$.

4. This is a special case of the previous problem. 


5. No. The reason is that the subset $\mathbb{R}^1p$ cannot be locally compact. In fact, if it
were locally compact, then it would be open in its closure, by Problem 4 in Chapter X
of Basic. Since $T^2$ is a group and $\mathbb{R}^1p$ is a subgroup, $(\mathbb{R}^1p)^{\mathrm{cl}}$ is a group, and $\mathbb{R}^1p$
would be an open dense subgroup. An open subgroup is closed, and hence $\mathbb{R}^1p$ would
be equal to $(\mathbb{R}^1p)^{\mathrm{cl}}$, i.e., $\mathbb{R}^1p$ would have to be closed in $T^2$. Then $\mathbb{R}^1p \cap \{(e^{i\theta},1)\}$
would be a closed subgroup of the circle group $\{(e^{i\theta},1)\}$ and would have to be a finite
subgroup or the entire circle. On the other hand, we readily check that $\mathbb{R}^1p \cap \{(e^{i\theta},1)\}$
is countably infinite. It therefore cannot be closed.

6. Take $V$ to be any bounded open neighborhood of 1. Inductively for $n \ge 1$,
choose $g_n$ such that $g_n \notin \bigcup_{k<n} g_kV$. Then choose an open neighborhood $U$ of 1
with $U = U^{-1}$ and $UU \subseteq V$. Let us see that $g_kU \cap g_nU = \varnothing$ if $k < n$. If $g$ is in
$g_kU \cap g_nU$, then $g_ku = g_nu'$ with $u$ and $u'$ in $U$, and hence $g_n$ is in $g_kUU^{-1} \subseteq g_kV$.
This contradicts the defining property of $g_n$. Thus the sets $g_nU$ are disjoint. The left
Haar measure of their union therefore equals the sum of their left Haar measures,
and their left Haar measures are equal to some positive number, $U$ being a nonempty
open set. Consequently the left Haar measure of $G$ is infinite.


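7. For (a), we have

$$\begin{aligned}
\lambda(E)\rho(G) &= \int_G\int_G I_E(y)\,d\lambda(y)\,d\rho(x) = \int_G\int_G I_E(xy)\,d\lambda(y)\,d\rho(x) \\
&= \int_G\int_G I_E(xy)\,d\rho(x)\,d\lambda(y) = \int_G\int_G I_E(x)\,d\rho(x)\,d\lambda(y) \\
&= \lambda(G)\rho(E).
\end{aligned}$$

Therefore $\lambda(E)\rho(G) = \lambda(G)\rho(E)$ as asserted.

For (b), let $\lambda_1$ and $\lambda_2$ be two left Haar measures. Without loss of generality, we
may assume that $\lambda_1(G) = \lambda_2(G) = 1$. Let $\rho$ be a right Haar measure (existence by
Theorem 12.1). Applying (a) twice, we obtain $\lambda_1(E)\rho(G) = \lambda_1(G)\rho(E) = \rho(E) =
\lambda_2(G)\rho(E) = \lambda_2(E)\rho(G)$, and hence $\lambda_1(E) = \lambda_2(E)$ on Baire sets. Consequently
$\lambda_1 = \lambda_2$ as regular Borel measures.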

8. In (a), both are Haar measures on G“ of total measure one. Parts (b) and (c) 
are special cases of Problems 15-19 of Chapter XI of Basic. 

9. For fixed $g$ in $G$, we have $d_l(\Phi(gx)) = d_l(\Phi(g)\Phi(x)) = d_l(\Phi(x))$, and hence
$d_l(\Phi(\cdot))$ and $d_l(\cdot)$ are left Haar measures on $G$. The uniqueness in Theorem 6.8
shows that they are multiples of one another.

10. Under left translation we have $(s_0,t_0)(s,t) = (s_0s,(s^{-1}t_0s)t)$. If $\varphi$ is left
translation by $(s_0,t_0)$, then $(ds\,dt)_{\varphi^{-1}} = d(s_0s)\,d((s^{-1}t_0s)t) = ds\,dt$, and $ds\,dt$ is
a left Haar measure. Under right translation we have $(s,t)(s_0,t_0) = (ss_0,(s_0^{-1}ts_0)t_0)$.
Thus $ds\,dt$ goes to $d(ss_0)\,d((s_0^{-1}ts_0)t_0) = ds\,d(s_0^{-1}ts_0) = \delta(s_0^{-1})\,ds\,dt$, and
$\delta(s)\,ds\,dt$ goes to $\delta(ss_0)\delta(s_0^{-1})\,ds\,dt = \delta(s)\,ds\,dt$. In other words, $\delta(s)\,ds\,dt$ is
a right Haar measure.

11. In (a), we have $\int_V f(c^{-1}x)\,dx = \int_V f(x)\,d(cx) = |c|_V\int_V f(x)\,dx$ for $f$ in
$C_{\mathrm{com}}(V)$. Hence $|c_1c_2|_V\int_V f(x)\,dx = \int_V f((c_1c_2)^{-1}x)\,dx = \int_V f(c_2^{-1}x)\,d(c_1x) =
|c_1|_V\int_V f(c_2^{-1}x)\,dx = |c_1|_V|c_2|_V\int_V f(x)\,dx$. Choosing $f$ with $\int_V f(x)\,dx \ne 0$,
we obtain $|c_1c_2|_V = |c_1|_V|c_2|_V$ when $c_1 \ne 0$ and $c_2 \ne 0$. The equality is trivial when
one or both of $c_1$ and $c_2$ are 0, and hence we have $|c_1c_2|_V = |c_1|_V|c_2|_V$ in all cases.

To prove continuity, we first check continuity at each $c_0 \ne 0$. Let $S$ = support($f$),
and let $N$ be a compact neighborhood of $c_0$ not containing 0. If $c$ is in $N$, then
$f(c^{-1}x)$ is nonzero only for $x$ in the compact set $NS$. Let $\epsilon > 0$ be given. Continuity
of $(c,x) \mapsto f(c^{-1}x)$ allows us to find, for each $x$ in $NS$, an open subneighborhood
$N_x$ of $c_0$ and an open neighborhood $U_x$ of $x$ such that $|f(c^{-1}y) - f(c_0^{-1}x)| < \epsilon$ for
$c \in N_x$ and $y \in U_x$. Then $|f(c^{-1}y) - f(c_0^{-1}y)| < 2\epsilon$ for $c \in N_x$ and $y \in U_x$.
The open sets $U_x$ cover $NS$. Forming a finite subcover and intersecting the
corresponding finitely many sets $N_x$, we obtain an open neighborhood $N'$ of $c_0$ such
that $|f(c^{-1}y) - f(c_0^{-1}y)| < 2\epsilon$ for $c \in N'$ whenever $y$ is in $NS$. As a result,
$c \mapsto \int_V f(c^{-1}x)\,dx$ is continuous at $c = c_0$. Therefore $c \mapsto |c|_V\int_V f(x)\,dx$ is
continuous at $c_0$, and so is $c \mapsto |c|_V$.

To prove continuity at $c = 0$, we are to show that $\lim_{c\to 0}\int_V f(c^{-1}x)\,dx = 0$.
Let $U$ be any compact neighborhood of 0 in $V$. Find a sufficiently small neighborhood
$N$ of 0 in the field of scalars such that $c \in N$ implies that $c\,$support($f$) does not meet
$U^c$. Then $c^{-1}U^c \cap$ support($f$) $= \varnothing$. For such $c$'s, we have $\bigl|\int_V f(c^{-1}x)\,dx\bigr| =
\bigl|\int_U f(c^{-1}x)\,dx\bigr| \le \|f\|_{\sup}(dx(U))$. The desired limit relation follows.

Finally, even without the continuity at $c = 0$, these properties imply that $|c|_V =
|c|^t$ for some real $t$. The continuity at $c = 0$ forces $t > 0$. Then it follows that
$|c_1|_V \le |c_2|_V$ if $|c_1| \le |c_2|$.

In (b), $V/W$ is itself a locally compact topological vector space, and its group
operation is addition. With the normalization of Haar measures as in Theorem 6.18,
$\mu$ becomes a Haar measure on $V/W$, and we write it as $d(v+W)$. Then $\int_V f(v)\,dv =
\int_{V/W}\bigl(\int_W f(v+w)\,dw\bigr)\,d(v+W)$. If we replace $f$ by $f(c^{-1}\,\cdot\,)$ and move the $c$ into
the measures, we obtain $\int_V f(v)\,d(cv) = \int_{V/W}\bigl(\int_W f(v+w)\,d(cw)\bigr)\,d(c(v+W))$
and therefore $|c|_V\int_V f(v)\,dv = |c|_{V/W}\int_{V/W}\bigl(|c|_W\int_W f(v+w)\,dw\bigr)\,d(v+W)$.
Hence $|c|_V = |c|_{V/W}|c|_W$.

In (c), choose $N$ such that $|2|_V < 2^N$. If $V$ has an $N$-dimensional subspace $W$, then
Proposition 4.5 and Corollary 4.6 show that this subspace is closed and is Euclidean.
Therefore $|2|_W = 2^N$. Then (b) shows that $|2|_{V/W} = |2|_V/|2|_W = 2^{-N}|2|_V < 1$.
But this conclusion contradicts the fact that $|c|_{V/W} \ge 1$ if $|c| \ge 1$. We conclude that
$\dim V < N$.


12. By inspection, $(\ell_{v_1},\ell_{v_2}) = (v_2,v_1)$ has the properties of an inner product.
The Banach-space norm of $\ell_v$ is $\sup_{\|v'\|\le 1}|\ell_v(v')| = \sup_{\|v'\|\le 1}|(v',v)|$. This is $\le
\|v\| = \|\ell_v\|$ by the Schwarz inequality, and it is $\ge \|v\| = \|\ell_v\|$ because we can
choose $v' = v/\|v\|$.

The contragredient has $(\Phi^c(x)\ell_v)(v') = \ell_v(\Phi(x^{-1})v') = (\Phi(x^{-1})v',v) =
(v',\Phi(x)v) = \ell_{\Phi(x)v}(v')$. Hence $\Phi^c(x)\ell_v = \ell_{\Phi(x)v}$, and $(\Phi^c(x)\ell_v,\Phi^c(x)\ell_{v'}) =
(\Phi(x)v',\Phi(x)v) = (v',v) = (\ell_v,\ell_{v'})$.

13. Taking the adjoint of $E\Phi(g) = \Phi'(g)E$ gives $\Phi(g)^*E^* = E^*\Phi'(g)^*$ for all $g$.
Since $\Phi$ and $\Phi'$ are unitary, $\Phi(g)^{-1}E^* = E^*\Phi'(g)^{-1}$ for all $g$, and thus $\Phi(g)E^* = E^*\Phi'(g)$.
Then $E^*E\Phi(g) = E^*\Phi'(g)E = \Phi(g)E^*E$. By Schur's Lemma, $E^*E$ is scalar, say
equal to $cI$. Since $E$ is invertible, $c$ is not zero. If $v \ne 0$, then $c\|v\|^2 = (cI(v),v) =
(E^*E(v),v) = (E(v),E(v)) \ge 0$. So $c > 0$. If $\sqrt{c}$ denotes the positive square root
of $c$, then $F = (\sqrt{c})^{-1}E$ exhibits $\Phi$ and $\Phi'$ as equivalent, and $F$ is unitary because
$F^*F = (\sqrt{c})^{-2}E^*E = c^{-1}cI = I$.


14. The operator $\Phi(\rho)$, for $\rho$ in $O(N)$, makes sense on all of $L^2(\mathbb{R}^N)$, as well as
on the vector space $H_k$. It was observed in the example toward the end of Section 8
that the Fourier transform $\mathcal{F}$ commutes with the action by members of $O(N)$. Thus
we have F(®(p)(hj(x) f(Ix|))) = ®(p)F(hj(x) f (\x|)). The left side at y equals 
the expression >; O(p), Fi) f(aIIO) = Xj, OW) Lys) filly) = 
d%, (Oi i(y|))AsQ), and the right side is 6(p)(>°, A: (y) fly) 
= YY, PM shsO) fly) = Ly (LL, Ose fly) )as(y). The equality of 
the two sides gives us, for each | y|, the matrix equality asserted in (a). 

Corollary 6.27, the formula of part (a), and the irreducibility of H, together imply 
that F(|y|) is a scalar matrix for each |y|. In other words, fj;(ly|) = g(lyl)6i; 
for some scalar-valued function g. Then F(hj(x) f(x) = 1 AOA Uy) = 
AiO) 8(y)5i; = 4; g(yl|) for all 7. Since A is a linear combination of the 
hj’s, Fih(x) f (x))(y) = h(y)g (yl). This proves (b). 

For (c), we observe that F(|y|) is continuous if (|x|) is continuous of compact 
support. In fact, the inner product on H; can be taken to be integration with dw over 
the unit sphere S¥~!. By homogeneity this is the same as the inner product relative 
to r~>* dw over the sphere of radius r centered at 0. Then the formula for f; jis 


fii) = Sons Fax) f (x) rayhi For do 
= fon Flhj(x) f (x) (royhi(@yr~* dw 


for $r > 0$, and this is continuous in $r$ since $\mathcal{F}(h_j(x)f(|x|))$ is continuous on $\mathbb{R}^N$. Thus
the vector subspace of all $f$ in $L^2((0,\infty),r^{N-1+2k}\,dr)$ for which $\mathcal{F}(h(x)f(|x|))$
is of the form $h(y)g(|y|)$ contains the dense subspace $C_{\mathrm{com}}((0,\infty))$. Let $f^{(n)}$ in
$C_{\mathrm{com}}((0,\infty))$ tend to $f$ in $L^2((0,\infty),r^{N-1+2k}\,dr)$. Then $h(x)f^{(n)}(|x|)$ tends to
$h(x)f(|x|)$ in $L^2(\mathbb{R}^N)$, and $\mathcal{F}(h(x)f^{(n)}(|x|))$ tends to $\mathcal{F}(h(x)f(|x|))$ in norm. A
subsequence therefore converges almost everywhere. Since $\mathcal{F}(h(x)f^{(n)}(|x|))(y) =
h(y)g_n(|y|)$ almost everywhere, the limit function must be of the form $h(y)g(|y|)$
almost everywhere.


15. If $\{v_j\}$ is an orthonormal basis of $V$, then $\{\ell_{v_j}\}$ is an orthonormal basis of $V^*$,
and $(\Phi^c(x)\ell_{v_j},\ell_{v_j}) = (\ell_{\Phi(x)v_j},\ell_{v_j}) = (v_j,\Phi(x)v_j) = \overline{(\Phi(x)v_j,v_j)}$. Summing on $j$
gives the desired equality of group characters.

16. Following the notation of that example, let $\tau_{ij}(x) = (\tau(x)u_j,u_i)$, let $l$
be the left-regular representation, and let $\ell_v$ be as in Problem 12. Consider, for
fixed $j_0$, the image of $\tau^c(g)\ell_{u_i}$ under the linear extension of the map $E'(\ell_{u_i})(x) =
(\tau(x)u_{j_0},u_i)$. This is $E'(\ell_{\sum_k c_ku_k})(x) = E'(\sum_k \bar{c}_k\ell_{u_k})(x) = \sum_k \bar{c}_kE'(\ell_{u_k})(x) =
\sum_k \bar{c}_k(\tau(x)u_{j_0},u_k) = (\tau(x)u_{j_0},\sum_k c_ku_k)$, and hence $E'(\ell_v)(x) = (\tau(x)u_{j_0},v)$.
Then the image of interest is

$$\begin{aligned}
E'(\tau^c(g)\ell_{u_i})(x) &= E'(\ell_{\tau(g)u_i})(x) = (\tau(x)u_{j_0},\tau(g)u_i) \\
&= (\tau(g^{-1}x)u_{j_0},u_i) = (l(g)\tau_{ij_0})(x).
\end{aligned}$$

Hence $l$ carries a column of matrix coefficients to itself and is equivalent on such a
column to $\tau^c$.

17. In (a), the left-regular representation on $G = \mathbb{R}/2\pi\mathbb{Z}$ is given by $(l(\theta)f)(e^{i\varphi})
= f(e^{i(\varphi-\theta)})$. Assuming on the contrary that $l$ is continuous in the operator norm
topology, choose $\delta > 0$ such that $|\theta| < \delta$ implies $\|l(\theta) - 1\| < 1$. Since $\|e^{in\varphi}\|_2 = 1$,
we must have $\|l(\theta)(e^{in\varphi}) - e^{in\varphi}\|_2 < 1$ for $|\theta| < \delta$. Then

$$|e^{-in\theta}-1|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}|e^{-in\theta}-1|^2\,d\varphi = \frac{1}{2\pi}\int_{-\pi}^{\pi}|e^{in(\varphi-\theta)}-e^{in\varphi}|^2\,d\varphi < 1$$

for all $\theta$ with $|\theta| < \delta$ and for all $n$. For large $N$, $\theta = \frac{\pi}{2N}$ satisfies the condition on $\theta$,
and $n = N$ has $|e^{-in\theta}-1|^2 = |-i-1|^2 = 2$, contradiction.

In (b), $\|\Phi(g)v-v\|^2 = (\Phi(g)v-v,\Phi(g)v-v) = \|\Phi(g)v\|^2 - 2\,\mathrm{Re}(\Phi(g)v,v) +
\|v\|^2 = 2\|v\|^2 - 2\,\mathrm{Re}(\Phi(g)v,v)$. The weak continuity shows that the right side tends
to 0 as $g$ tends to 1, and hence the left side tends to 0, i.e., $\Phi$ is strongly continuous.

18. In (a), we apply Problem 15. Let $\{u_i\}$ be an orthonormal basis of the space
of $\Phi$. In the formula $(\Phi(f)u_k,u_k) = \int_G (\Phi(x)u_k,u_k)f(x)\,dx$, we take $f$ to be of
the form $f(x) = \overline{(\Phi(x)u_j,u_i)}$. Substituting and using Schur orthogonality gives
$(\Phi(f)u_k,u_k) = d^{-1}(u_k,u_j)\overline{(u_k,u_i)}$. Summing on $k$ shows that $\mathrm{Tr}\,\Phi(f) = d^{-1}\delta_{ij}$,
and the right side is $d^{-1}f(1)$ for this $f$. Thus $f(1) = d\,\mathrm{Tr}\,\Phi(f)$. Passing to a linear
combination of such $f$'s, we obtain the asserted formula.

Part (b) follows by taking linear combinations of results from (a), and part (c)
follows by applying (b) to a function $f^* * f$, where $f^*(x) = \overline{f(x^{-1})}$. Part (d)
follows by decomposing the right-regular representation on $L^2(G)$ into irreducible
representations and using the identification in Section 8 of the isotypic subspaces.

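19. For (a), $h*f(x) = \int_G h(xy^{-1})f(y)\,dy = \int_G h(y^{-1}x)f(y)\,dy = f*h(x)$.

For (b), it is enough to check the assertion for $f$ equal to a matrix coefficient
$x \mapsto (\Phi(x)u_j,u_i) = \Phi_{ij}(x)$ of an irreducible unitary representation $\Phi$. If $\Phi$ has
degree $d$, then we have

$$\begin{aligned}
\int_G f(gxg^{-1})\,dg &= \int_G \Phi_{ij}(gxg^{-1})\,dg = \sum_{k,l}\int_G \Phi_{ik}(g)\Phi_{kl}(x)\Phi_{lj}(g^{-1})\,dg \\
&= \sum_{k,l}\Phi_{kl}(x)\int_G \Phi_{ik}(g)\overline{\Phi_{jl}(g)}\,dg = \sum_{k,l}\Phi_{kl}(x)\,d^{-1}\delta_{ij}\delta_{kl} = \delta_{ij}d^{-1}\sum_k \Phi_{kk}(x),
\end{aligned}$$

as required.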

In (c), Corollary 6.33 shows that $h$ is the uniform limit of a net of trigonometric
polynomials. Since $C(G)$ is metrizable, $h$ is the uniform limit of a sequence of
trigonometric polynomials $h_n$. If $\epsilon > 0$ is given, we can find $N$ such that $n \ge N$
implies $|h_n(y) - h(y)| \le \epsilon$ for all $y$. Then $|h_n(gxg^{-1}) - h(gxg^{-1})| \le \epsilon$ and so
$\bigl|\int_G h_n(gxg^{-1})\,dg - \int_G h(gxg^{-1})\,dg\bigr| \le \epsilon$. The function $H_n(x) = \int_G h_n(gxg^{-1})\,dg$
is a linear combination of irreducible characters by (b), and $\int_G h(gxg^{-1})\,dg$ is just
$h$. Thus $h$ is the uniform limit of the sequence of functions $H_n$, each of which is a
linear combination of characters.

In (d), it is enough to prove that the space of linear combinations of irreducible
characters is dense in the vector subspace of $L^2$ in question. If $h$ is in this subspace,
choose a sequence of functions $h_n$ in $C(G)$ converging to $h$ in $L^2$. Then
$H_n(x) = \int_G h_n(gxg^{-1})\,dg$ converges to $h$ in $L^2$, and each $H_n$ is continuous and
has the invariance property that $H_n(gxg^{-1}) = H_n(x)$. Hence the vector subspace
of members of $C(G)$ with this invariance property is $L^2$ dense in the subspace of
$L^2$ in question. By (c), any member of $C(G)$ with the invariance property is the
uniform limit of a sequence of functions, each of which is a finite linear combination
of characters. Since uniform convergence implies $L^2$ convergence on a space of finite
measure, the space of linear combinations of irreducible characters is $L^2$ dense in the
space in question.


20. In (a), the sum $\sum_\alpha (d^{(\alpha)})^2$ counts the number of elements in the basis of $L^2(G)$
in Corollary 6.32. Another basis consists of the indicator functions of one-element
subsets of $G$, and the two bases must have the same number of elements.

In (b), again we have two ways of computing a dimension, one from (d) in the
previous problem, and the other from indicator functions of single conjugacy classes.
The two computations must give the same result.

In (c), representatives of the possible cycle structures are (1234), (123), (12),
(12)(34), (1). By (b), the number of $\Phi^{(\alpha)}$'s is 5. Two of these have degree 1. For the
other three the sums of the squares of the degrees must equal $24 - 1 - 1 = 22$. The
only possibility is $22 = 9 + 9 + 4$, and thus the degrees are 1, 1, 2, 3, 3.
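
The uniqueness claim in (c) is a small finite search, which can be confirmed by brute force. The following is a sketch for illustration; the function name and its parameters are made up for this example.

```python
from itertools import combinations_with_replacement

def remaining_degrees(order: int, num_classes: int, known: tuple) -> list:
    """List all multisets of positive integers that complete `known` to
    `num_classes` degrees whose squares sum to `order`."""
    target = order - sum(d * d for d in known)
    count = num_classes - len(known)
    bound = int(target ** 0.5)
    return [
        combo
        for combo in combinations_with_replacement(range(1, bound + 1), count)
        if sum(d * d for d in combo) == target
    ]

if __name__ == "__main__":
    # Symmetric group on 4 letters: order 24, 5 conjugacy classes, two known degree-1 representations.
    print(remaining_degrees(24, 5, (1, 1)))
```

The search allows the remaining degrees to be 1 as well, so it confirms that $4 + 9 + 9$ is the only way to write 22 as a sum of three positive squares.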


21. Let $\Omega \subseteq G$ be the set of products $ST$, and let $K = S \cap T$. The group $S\times T$ acts
continuously on $\Omega$ by $(s,t)\omega = s\omega t^{-1}$, and the isotropy subgroup at 1 is the closed
subgroup diag $K$. Thus the map $(s,t) \mapsto st^{-1}$ descends to a map of $(S\times T)/\mathrm{diag}\,K$
onto $\Omega$. Since $\Omega$ is assumed open in $G$, it is locally compact Hausdorff in the relative
topology. Then Problem 3 shows that the map of $(S\times T)/\mathrm{diag}\,K$ onto $\Omega$ is open,
and it follows by taking compositions that the multiplication map of $S\times T$ to $\Omega$ is
open.

22. In the two parts, $AN$ and $MAN$ are subgroups closed under limits of sequences,
hence are closed subgroups. Consider the decompositions in (a) and (b). For the
decomposition in (a), we multiply out the relation $k_\theta a_x n_y = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and solve for $\theta$,
$x$, and $y$, obtaining

$$e^x = \sqrt{a^2+c^2}, \quad \cos\theta = e^{-x}a, \quad \sin\theta = e^{-x}c, \quad y = e^{-2x}(ab+cd).$$

Hence we have the required unique decomposition. Since $KAN$ equals all of $G$, the
image under multiplication of $K\times AN$ is open in $G$. For the decomposition in (b), we
multiply out the relation $v_t m_\pm a_x n_y = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and solve for $t$, $m_\pm$, $x$, and $y$, obtaining

$$\pm = \operatorname{sgn} a, \quad e^x = |a|, \quad y = b/a, \quad t = c/a.$$

Hence we have the required unique decomposition if $a \ne 0$, and the decomposition
fails if $a = 0$. Since $VMAN$ equals the open subset of $G$ where the upper left entry
is nonzero, the image under multiplication of $V\times MAN$ is open in $G$.
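
The two sets of formulas can be tested by multiplying the factors back together. The sketch below assumes the following matrix conventions, which are consistent with the formulas in the hint but are not spelled out in it: $k_\theta$ is rotation by $\theta$, $a_x = \mathrm{diag}(e^x, e^{-x})$, $n_y$ is upper triangular unipotent with entry $y$, $v_t$ is lower triangular unipotent with entry $t$, and $m_\pm = \pm I$. All function names are hypothetical.

```python
import math

def mul(p, q):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def k_mat(theta):
    return [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]

def a_mat(x):
    return [[math.exp(x), 0.0], [0.0, math.exp(-x)]]

def n_mat(y):
    return [[1.0, y], [0.0, 1.0]]

def v_mat(t):
    return [[1.0, 0.0], [t, 1.0]]

def decompose_kan(a, b, c, d):
    """Return (theta, x, y) with [[a, b], [c, d]] = k_theta a_x n_y, assuming ad - bc = 1."""
    ex = math.hypot(a, c)                # e^x = sqrt(a^2 + c^2)
    theta = math.atan2(c / ex, a / ex)   # cos(theta) = e^{-x} a, sin(theta) = e^{-x} c
    y = (a * b + c * d) / (ex * ex)      # y = e^{-2x} (ab + cd)
    return theta, math.log(ex), y

def decompose_vman(a, b, c, d):
    """Return (t, sign, x, y) with [[a, b], [c, d]] = v_t (sign * I) a_x n_y, assuming a != 0."""
    if a == 0:
        raise ValueError("the V M A N decomposition fails when the upper left entry is 0")
    sign = 1.0 if a > 0 else -1.0
    return c / a, sign, math.log(abs(a)), b / a

def rebuild_kan(theta, x, y):
    return mul(k_mat(theta), mul(a_mat(x), n_mat(y)))

def rebuild_vman(t, sign, x, y):
    m = [[sign, 0.0], [0.0, sign]]
    return mul(v_mat(t), mul(m, mul(a_mat(x), n_mat(y))))

if __name__ == "__main__":
    g = (2.0, 1.0, 3.0, 2.0)  # determinant 1
    print(rebuild_kan(*decompose_kan(*g)))
    print(rebuild_vman(*decompose_vman(*g)))
```

Using `atan2` picks the representative of $\theta$ in $(-\pi, \pi]$; any other choice modulo $2\pi$ gives the same $k_\theta$.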

The group $G = SL(2,\mathbb{R})$ is unimodular, being generated by commutators, and
hence the formula in Theorem 12.9 simplifies to $\int_G f(x)\,dx = \int_{S\times T} f(st)\,d_ls\,d_rt$.
For (a), we apply this formula with $S = K$ and $T = AN$. The group $K$ is unimodular,
so that $d_ls$ becomes $d\theta$, and we easily compute that $d_rt$ can be taken to be $e^{2x}\,dy\,dx$.
For (b), we apply the formula with $S = V$ and $T = MAN$. The group $V$ is
unimodular, and we find that the right Haar measure for $MAN$ can be taken to be
$e^{2x}\,dy\,dx$ on the $m_+$ part and the same thing on the $m_-$ part.

25. If $h$ is in $C([0,\pi])$, the previous two problems produce a unique $f = f_h$ in
$C(G)$ such that $f_h$ is constant on conjugacy classes and has $h(\theta) = f_h(t_\theta)$. Define
$L(h) = \int_G f_h(x)\,dx$. This is a positive linear functional on $C([0,\pi])$ and yields
the measure $\mu$ by the Riesz Representation Theorem. If $f$ is any member of $C(G)$
and $f_0(x) = \int_G f(gxg^{-1})\,dg$, then $\int_G f(x)\,dx = \int_G f_0(x)\,dx$ and $f_0$ is $f_h$ for the
function $h(\theta) = f_0(t_\theta)$. The construction of $\mu$ makes $\int_{[0,\pi]} f_0(t_\theta)\,d\mu = \int_G f_0(x)\,dx$.
Substitution gives $\int_{[0,\pi]}\bigl[\int_G f(gt_\theta g^{-1})\,dg\bigr]\,d\mu = \int_G f_0(x)\,dx = \int_G f(x)\,dx$.

26. The crux of the matter is (a). The formula of Problem 25, together with the
character formula for $\chi_n$, gives

$$\delta_{n0} = \int_G \chi_n\overline{\chi_0}\,dx = \int_{[0,\pi]} \bigl(e^{in\theta} + e^{i(n-2)\theta} + \cdots + e^{-in\theta}\bigr)\,d\mu(\theta).$$

This says that $\int_{[0,\pi]} d\mu(\theta) = 1$ for $n = 0$, $\int_{[0,\pi]} (e^{i\theta} + e^{-i\theta})\,d\mu(\theta) = 0$ for $n = 1$,
and $\int_{[0,\pi]} (e^{2i\theta} + 1 + e^{-2i\theta})\,d\mu(\theta) = 0$ for $n = 2$. The middle term of the integrand
for $n = 2$ has already been shown to produce 1, and thus the $n = 2$ result may be
rewritten as $\int_{[0,\pi]} (e^{2i\theta} + e^{-2i\theta})\,d\mu(\theta) = -1$. For $n \ge 3$, comparison of the displayed
formula for $n$ with what it is for $n-2$ gives $0 = \int_{[0,\pi]} (e^{in\theta} + e^{-in\theta})\,d\mu(\theta) + \delta_{n-2,0}$.
Since $n - 2 > 0$, we obtain $\int_{[0,\pi]} (e^{in\theta} + e^{-in\theta})\,d\mu(\theta) = 0$ for $n > 2$.

For the rest we replace $\theta$ by $-\theta$ in our integrals and see that the integral
$\int_{[-\pi,0]} (e^{in\theta} + e^{-in\theta})\,d\mu(-\theta)$ is 0 for $n = 1$ and $n \ge 3$, and is $-1$ for $n = 2$.
Therefore $\int_{[-\pi,\pi]} (e^{in\theta} + e^{-in\theta})\,d\mu'(\theta)$ is 0 for $n = 1$ and $n \ge 3$, and is $-1$
for $n = 2$. We can regard $\mu'$ as a periodic Stieltjes measure whose Fourier series
may be written in terms of cosines and sines. Since $\mu'(E) = \mu'(-E)$, only
the cosine terms contribute. There are no point masses since only finitely many
Fourier coefficients are nonzero. Since $\cos 2\theta$ has a cosine series with no other
$\cos k\theta$ contributing, $\int_{-\pi}^{\pi} \cos n\theta\,d\mu'(\theta) = -\frac{1}{2}\delta_{n,2} = -\frac{1}{2\pi}\int_{-\pi}^{\pi} \cos n\theta\cos 2\theta\,d\theta$
for all $n > 0$. Taking into account that $\mu'([-\pi,\pi]) = 1$, we conclude from
the Fourier coefficients that $d\mu'(\theta) = \frac{1}{2\pi}(1 - \cos 2\theta)\,d\theta = \frac{1}{\pi}\sin^2\theta\,d\theta$. Since
$\int_{[0,\pi]}\int_G f(gt_\theta g^{-1})\,dg\,d\mu(\theta) = \int_{[-\pi,\pi]}\int_G f(gt_\theta g^{-1})\,dg\,d\mu'(\theta)$, substitution into the formula of
Problem 25 gives the desired result.
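
The density found for $\mu'$ can be checked against the moment conditions derived above. The sketch below assumes the normalization $d\mu'(\theta) = \frac{1}{\pi}\sin^2\theta\,d\theta$ on $[-\pi,\pi]$, which is the one forced by total mass 1 and second cosine coefficient $-\frac12$; the function names are made up for this example.

```python
import math

def integrate_against_mu_prime(func, steps: int = 20_000) -> float:
    """Midpoint-rule integral of func over [-pi, pi] against (1/pi) sin^2(theta) d(theta)."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi + (i + 0.5) * h
        total += func(theta) * math.sin(theta) ** 2 / math.pi
    return total * h

def moment(n: int) -> float:
    """Integral of e^{in theta} + e^{-in theta} = 2 cos(n theta) against d(mu')."""
    return integrate_against_mu_prime(lambda t: 2 * math.cos(n * t))

if __name__ == "__main__":
    print(integrate_against_mu_prime(lambda t: 1.0))  # total mass
    for n in range(1, 6):
        print(n, moment(n))
```

The expected pattern is total mass 1, moment $-1$ for $n = 2$, and moment 0 for $n = 1$ and $n \ge 3$, matching the conditions stated in the hint.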


27. Problem 19d shows that the irreducible characters give an orthonormal basis for
the subspace of $L^2$ functions on $SU(2)$ invariant under conjugation. In view of
Problem 26d, the restrictions of these characters to the diagonal subgroup $T$ therefore form
an orthonormal basis of the subspace of all functions $\chi$ in $L^2([-\pi,\pi],\frac{1}{\pi}\sin^2\theta\,d\theta)$
with $\chi(\theta) = \chi(-\theta)$. Since $\sin^2\theta = \frac{1}{4}|e^{i\theta} - e^{-i\theta}|^2$, the conditions to have a new $\chi$
are that it be a continuous function with $\chi(\theta) = \chi(-\theta)$ such that

$$\int_{-\pi}^{\pi} (e^{i\theta} - e^{-i\theta})\,\chi(\theta)\,\overline{(e^{i(n+1)\theta} - e^{-i(n+1)\theta})}\,d\theta = 0$$

for every integer $n \ge 0$. Using the condition $\chi(\theta) = \chi(-\theta)$, we can write the Fourier
series of $\chi$ as $\chi(\theta) \sim \frac{a_0}{2} + \sum_{k=1}^\infty a_k\cos k\theta$. For $n \ge 1$, the orthogonality condition
says that $\int_{-\pi}^{\pi} \chi(\theta)(\cos(n+2)\theta - \cos n\theta)\,d\theta = 0$. Hence $a_{n+2} = a_n$ for $n \ge 1$. By
the Riemann-Lebesgue Lemma, all $a_n$ are 0 for $n \ge 1$. Thus $\chi$ is constant. Since
$\chi_0 = 1$ is already a known character, $\chi = 0$.


28. Let $F$ be a compact topological field. If $F$ is discrete, then each one-point
set is open, and the compactness forces $F$ to be finite. Otherwise, every point in $F$
is a limit point. Take a net $\{x_\alpha\}$ in $F - \{0\}$ with limit 0, and form the net $\{x_\alpha^{-1}\}$. By
compactness this has a convergent subnet $\{x_{\alpha_\mu}^{-1}\}$ with some limit $x_0$. By continuity of
multiplication, $\{x_{\alpha_\mu}x_{\alpha_\mu}^{-1}\}$ converges to $0x_0 = 0$. On the other hand, every term of the
subnet is 1, and we conclude that a net that is constantly 1 is converging to 0. This
behavior means that $F$ is not Hausdorff, contradiction.


29. In (a), the argument that $c \mapsto |c|_F$ is continuous and satisfies $|c_1c_2|_F =
|c_1|_F|c_2|_F$ is the same as in Problem 11a.

For (b), we have $d(cx)/|cx|_F = (|c|_F\,dx)/(|c|_F|x|_F) = dx/|x|_F$. For (c), $|x|_F =
|x|$ if $F = \mathbb{R}$, and $|x|_F = |x|^2$ if $F = \mathbb{C}$. For (d), $|x|_F = |x|_p$ if $F = \mathbb{Q}_p$. For (e),
we have $I = p\mathbb{Z}_p$, and therefore the Haar measure of $I$ is the product of $|p|_p = p^{-1}$
times the Haar measure of $\mathbb{Z}_p$. Hence the Haar measure of $I$ is $p^{-1}$.


30. In (a), the image of a multiplicative character must be a subgroup of $S^1$, and the
only subgroup of $S^1$ contained within a neighborhood of radius 1 about the identity
is $\{1\}$. Thus as soon as $n$ is large enough so that $p^n\mathbb{Z}_p$ is mapped into the unit "ball"
about 1, $p^n\mathbb{Z}_p$ is mapped to 1.

In (b), $\mathbb{Q}_p/\mathbb{Z}_p$ is discrete since $\mathbb{Z}_p$ is open. Hence the cosets of the members of $\mathbb{Q}$
exhaust $\mathbb{Q}_p/\mathbb{Z}_p$, and it is enough to define a multiplicative character of the additive
group $\mathbb{Q}$ that is 1 on every member of $\mathbb{Q} \cap \mathbb{Z}_p$. Let $a/b$ be in lowest terms with
$b > 0$ and with $|a/b|_p = p^k$. If $k \le 0$, then set $\varphi_0(a/b) = 1$. If $k > 0$, write
$b = b'p^k$. Since $b'$ and $p^k$ are relatively prime, we can choose integers $c$ and $d$ with
$cp^k + b'd = a$, and then $\dfrac{a}{b'p^k} = \dfrac{c}{b'} + \dfrac{d}{p^k}$. We set $\varphi_0(a/b) = e^{2\pi id/p^k}$. The result is
well defined because if $c'p^k + b'd' = a$, then $(c - c')p^k + (d - d')b' = 0$ shows that
$d - d'$ is divisible by $p^k$ and hence that $e^{2\pi id/p^k} = e^{2\pi id'/p^k}$. One has to check that
$\varphi_0$ has the required properties.

In (c), we may assume that ¢ is not trivial. The p-adic number k can be formed by 
an inductive construction. Use (a) to choose the smallest possible (i.e., most negative) 
integer n such that ¢ is trivial on p”Z,. Then x +> g(p"x) is trivial on Z, and must 
be a power of e77/? on p~!. We match this, adjust ¢, iterate the construction through 
powers of p~!, and prove convergence of the series obtained for k. 
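
The recipe in part (b) can be checked by machine. The following is a minimal sketch, not part of the original hint: the function name `p_adic_fraction` is hypothetical, and it returns the rational number $d/p^k$ reduced modulo 1, so that the character value would be $e^{2\pi i d/p^k}$. It assumes $p$ is prime and solves $cp^k + b'd = a$ for $d$ modulo $p^k$ by inverting $b'$ modulo $p^k$.

```python
from fractions import Fraction

def p_adic_fraction(r: Fraction, p: int) -> Fraction:
    """Return d/p^k (mod 1) for r = a/b, following the recipe of part (b).

    The character value is then exp(2*pi*i * p_adic_fraction(r, p)).
    Assumes p is prime (an assumption of this sketch).
    """
    a, b = r.numerator, r.denominator  # lowest terms with b > 0
    k = 0
    while b % p == 0:                  # strip p^k from b, leaving b'
        b //= p
        k += 1
    if k == 0:
        return Fraction(0)             # r lies in Z_p, so the character is 1
    pk = p ** k
    # c*p^k + b'*d = a forces d = a * (b')^{-1} modulo p^k.
    d = (a * pow(b, -1, pk)) % pk
    return Fraction(d, pk)
```

Multiplicativity of the character corresponds to additivity of this value modulo 1, which is the property one "has to check"; the sketch only tests it on a finite sample.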


31. Write $r$ in $\mathbb{Q}$ as $r = \pm m/n$, assume without loss of generality that $r \neq 0$,
and factor $m$ and $n$ as products of powers of primes. Only finitely many primes can
appear, and $|r|_p = 1$ if $p$ is prime but is not one of those primes. The only other $v$ is
$\infty$, and thus $|r|_v = 1$ except for finitely many $v$.


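32. With $r \neq 0$ and with $r = \pm m/n$ in lowest terms, factor $m$ and $n$ into products
of primes as $m = \prod_{i=1}^{k} p_i^{a_i}$ and $n = \prod_{j=1}^{l} q_j^{b_j}$. Then $|r|_{p_i} = p_i^{-a_i}$ and $|r|_{q_j} = q_j^{b_j}$.
Hence

$$\prod_{p\ \mathrm{prime}} |r|_p = \Big(\prod_{i=1}^{k} p_i^{-a_i}\Big)\Big(\prod_{j=1}^{l} q_j^{b_j}\Big) = |r|_\infty^{-1} \quad\text{and}\quad \prod_{v \in P} |r|_v = |r|_\infty^{-1}\,|r|_\infty = 1.$$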
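
The product formula can be verified numerically for any particular rational. The sketch below is an illustration only; the helper names `prime_factors`, `p_adic_abs`, and `product_over_all_places` are hypothetical. It multiplies the ordinary absolute value by the $p$-adic absolute values at the finitely many primes dividing the numerator or denominator, all other places contributing 1.

```python
from fractions import Fraction

def prime_factors(n: int) -> set:
    """Set of primes dividing the positive integer n (trial division)."""
    primes, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def p_adic_abs(r: Fraction, p: int) -> Fraction:
    """|r|_p = p^(-v), where v is the exponent of p in the nonzero rational r."""
    a, b, v = abs(r.numerator), r.denominator, 0
    while a % p == 0:
        a //= p
        v += 1
    while b % p == 0:
        b //= p
        v -= 1
    return Fraction(1, p) ** v

def product_over_all_places(r: Fraction) -> Fraction:
    """Product of |r|_v over v = infinity and all primes, for nonzero r."""
    primes = prime_factors(abs(r.numerator)) | prime_factors(r.denominator)
    result = abs(r)                    # the factor |r|_infinity
    for p in primes:                   # every other prime contributes 1
        result *= p_adic_abs(r, p)
    return result
```

For example, `product_over_all_places(Fraction(-360, 77))` multiplies $360/77$ by the factors at 2, 3, 5, 7, and 11.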

33. The product of topological groups is a topological group, and thus each X (S) 
is a topological group. The defining properties of a group depend only on finitely 
many elements at a time, and these will all be in some X(S). Thus X acquires a 
group structure. The operations are continuous because again they can be considered 
in a suitable neighborhood of each point, and these points can be taken to be in some 
X(S) x X(S) in the case of multiplication, or in some X (S) in the case of inversion. 
Thus X is a topological group. The assertions about the situation with topological 
rings are handled similarly. 


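35. By continuity of translations, it is enough to find an open neighborhood $U$ of 0
in $\mathbb{A}_{\mathbb{Q}}$ with $U \cap \mathbb{Q} = \{0\}$. Since each $\mathbb{A}_{\mathbb{Q}}(S)$ is open in $\mathbb{A}_{\mathbb{Q}}$, it is enough to find this $U$
in some $\mathbb{A}_{\mathbb{Q}}(S)$. We do so for $S = \{\infty\}$. Let $U = (-1/2, 1/2) \times \big(\times_{p\ \mathrm{prime}}\, \mathbb{Z}_p\big)$. If $x$
is in $U$, then $|x|_p \leq 1$ for all primes $p$ and $|x|_\infty < 1/2$. By Problem 32, $x$ cannot be
in $\mathbb{Q}$ unless $x = 0$. Hence $U \cap \mathbb{Q} = \{0\}$. Proposition 6.3b shows that $\mathbb{Q}$ is therefore
discrete.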


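36. If $x = (x_v)$ is in $\mathbb{A}_{\mathbb{Q}}$, let $p_1, \dots, p_k$ be the primes $p$ where $|x_p|_p > 1$, and let
$|x_{p_j}|_{p_j} = p_j^{a_j}$. If $r = \prod_{j=1}^{k} p_j^{-a_j}$ and if we regard $r$ as embedded diagonally in $\mathbb{A}_{\mathbb{Q}}$,
then $|x_pr^{-1}|_p \leq 1$ for every prime $p$. Hence $xr^{-1}$ is in $\mathbb{A}_{\mathbb{Q}}(\{\infty\})$. Choose an integer
$n$ such that $|x_\infty r^{-1} - n|_\infty \leq 1$. If we then regard $n$ as embedded diagonally in $\mathbb{A}_{\mathbb{Q}}$,
then $|n|_p \leq 1$ for all primes $p$, and hence $n$ is in $\mathbb{A}_{\mathbb{Q}}(\{\infty\})$. Thus $xr^{-1} - n$ is in the
compact subset $K = [-1, 1] \times \big(\times_{p\ \mathrm{prime}}\, \mathbb{Z}_p\big)$ of $\mathbb{A}_{\mathbb{Q}}$. The continuous image of $K$ in
$\mathbb{A}_{\mathbb{Q}}/\mathbb{Q}$ is compact, and we have just seen that this image is all of $\mathbb{A}_{\mathbb{Q}}/\mathbb{Q}$. Thus $\mathbb{A}_{\mathbb{Q}}/\mathbb{Q}$
is compact.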
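37. Fix a finite subset $S$ of $P$ containing $\{\infty\}$. Then the projection of $\times_{w \in S}\, \mathbb{Q}_w^{\times}$
to $\mathbb{Q}_v^{\times}$ is continuous for each $v \in S$. Since also the inclusion $\mathbb{Q}_v^{\times} \to \mathbb{Q}_v$ is
continuous, the composition $\times_{w \in S}\, \mathbb{Q}_w^{\times} \to \mathbb{Q}_v$ is continuous. Thus the corresponding
mapping $\times_{w \in S}\, \mathbb{Q}_w^{\times} \to \times_{v \in S}\, \mathbb{Q}_v$ is continuous. In similar fashion
$\times_{w \notin S}\, \mathbb{Z}_w^{\times} \to \mathbb{Z}_v$ is a continuous function as a composition of continuous functions.
Thus $\times_{w \notin S}\, \mathbb{Z}_w^{\times} \to \times_{v \notin S}\, \mathbb{Z}_v$ is continuous. Putting these two compositions together
shows that $\mathbb{A}_{\mathbb{Q}}^{\times}(S) \to \mathbb{A}_{\mathbb{Q}}(S)$ is continuous, and therefore $\mathbb{A}_{\mathbb{Q}}^{\times}(S) \to \mathbb{A}_{\mathbb{Q}}$ is continuous.
Since this is true for each $S$, it follows that $\mathbb{A}_{\mathbb{Q}}^{\times} \to \mathbb{A}_{\mathbb{Q}}$ is continuous.

The topologies on the adeles $\mathbb{A}_{\mathbb{Q}}$ and the ideles $\mathbb{A}_{\mathbb{Q}}^{\times}$ are regular and Hausdorff, and
they are both separable. Hence $\mathbb{A}_{\mathbb{Q}}$ and $\mathbb{A}_{\mathbb{Q}}^{\times}$ are metric spaces, and the distinction
between the topologies can be detected by sequences. Let $p_n$ be the $n^{\mathrm{th}}$ prime, and
let $x_n = (x_{n,v})$ be the adele with $x_{n,v} = p_n$ if $v = p_n$ and $x_{n,v} = 1$ if $v \neq p_n$. The
result is a sequence $\{x_n\}$ of ideles, and we show that it converges to the idele $(1)$ in
the topology of the adeles but does not converge in the topology of ideles. In fact,
each $x_n$ lies in $\mathbb{A}_{\mathbb{Q}}(\{\infty\})$, which is an open set in $\mathbb{A}_{\mathbb{Q}}$. For each prime $p$, $x_{n,p} = 1$
if $n$ is large enough, and also $x_{n,\infty} = 1$ for all $n$. Since $\mathbb{A}_{\mathbb{Q}}(\{\infty\})$ has the product
topology, $\{x_n\}$ converges to $(1)$. On the other hand, if $\{x_n\}$ were to converge to some
limit $x$ in $\mathbb{A}_{\mathbb{Q}}^{\times}$, then $x$ would have to lie in some $\mathbb{A}_{\mathbb{Q}}^{\times}(S)$, and the ideles $x_n$ would have
to be in $\mathbb{A}_{\mathbb{Q}}^{\times}(S)$ for large $n$. But $(x_{n,v})$ is not in $\mathbb{A}_{\mathbb{Q}}^{\times}(S)$ as soon as the place $v = p_n$ is outside $S$.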

39. In (a), let $f$ be in $C(K)$. Corollary 6.7 shows that the map $k \mapsto kf$ of $K$ into
the left translates of $f$ is continuous into $C(K)$. The continuous image of a compact
set is compact, and thus $f$ is left almost periodic. Similarly $f$ is right almost periodic.

In (b), let $g$ be in $G$. Then $(gf)(x) = f(g^{-1}x) = F(\iota(g^{-1}x)) = F(\iota(g)^{-1}\iota(x)) =
(\iota(g)F)(\iota(x))$ shows that the set of left translates of $f$ can be regarded as a subset
of the set of left translates of $F$. The latter is compact, and hence the closure of the
former is compact.

40. We may view the unitary representation $\Phi$ as a continuous homomorphism
of $G$ into the compact group $K = U(N)$ for some $N$. If $f(x) = \Phi(x)_{ij}$, then
$f(x) = F(\Phi(x))$, where $F : U(N) \to \mathbb{C}$ is the $(i, j)^{\mathrm{th}}$ entry function. Thus
Problem 39b applies.

41. In (a), assume the contrary. Then for some $\epsilon > 0$ and for every neighborhood
$N$ of the identity, there exists $g_N$ in $N$ with $\|g_Nf - f\|_{\sup} \geq \epsilon$. Here $\{g_Nf\}$ is a net
in the compact metric space $K_f$, and there must be a convergent subnet $\{g_{N_\alpha}f\}$ with
limit some function $h$ in $K_f$. Since $\|g_{N_\alpha}f - h\|_{\sup}$ tends to 0, $h$ is not $f$. Thus $g_{N_\alpha}f$
converges uniformly to $h$ while, by continuity, tending pointwise to $f$. Since $h \neq f$,
we have arrived at a contradiction.

Part (b) follows from the formula $\|g_0(g_1f) - g_0(g_2f)\|_{\sup} = \|g_1f - g_2f\|_{\sup}$,
and part (c) follows from (b), uniform continuity, and completeness of the compact
set $K_f$.

42. Part (a) follows from a remark with Ascoli's Theorem when stated as Theorem
2.56 of Basic: the remark says that if we have an equicontinuous sequence of functions
from a compact metric space into a compact metric space, then there is a uniformly
convergent subsequence. Here if we have a sequence $\{\varphi_n\}$ of isometries of $X$ onto
itself, then the $\varphi_n$ are equicontinuous with $\delta = \epsilon$. Since the domain $X$ is compact
and the image $X$ is compact, the sequence has a uniformly convergent subsequence,
and we readily check that the limit is an isometry. Since every sequence in $\Gamma$ has a
convergent subsequence, $\Gamma$ is compact.

For (b), let members of $\Gamma$ have $\varphi_n \to \varphi$ and $\psi_n \to \psi$. Then

$$\rho(\varphi_n \circ \psi_n,\ \varphi \circ \psi) \leq \rho(\varphi_n \circ \psi_n,\ \varphi_n \circ \psi) + \rho(\varphi_n \circ \psi,\ \varphi \circ \psi).$$

The first term on the right side equals $\rho(\psi_n, \psi)$ because $\varphi_n$ is an isometry, and the
second term equals $\rho(\varphi_n, \varphi)$ because $\psi(x)$ describes all members of $X$ as $x$ varies
through $X$. These two terms tend to 0 by assumption and hence $\varphi_n \circ \psi_n \to \varphi \circ \psi$.
This proves continuity of multiplication. Similarly inversion is continuous.

For (c), let $\gamma_n \to \gamma$ and $x_n \to x$. Then

$$d(\gamma_n(x_n), \gamma(x)) \leq d(\gamma_n(x_n), \gamma(x_n)) + d(\gamma(x_n), \gamma(x)) \leq \rho(\gamma_n, \gamma) + d(\gamma(x_n), \gamma(x)),$$

and both terms on the right side tend to 0.

43. In (a), let $\{g_n\}$ be a net convergent to $g_0$ in $G$, and form $\{\iota_f(g_n)\}$. Then

$$\rho(\iota_f(g_n), \iota_f(g_0)) = \sup_{h \in K_f} \|\iota_f(g_n)h - \iota_f(g_0)h\|_{\sup} = \sup_{h \in K_f,\ x \in G} |\iota_f(g_n)h(x) - \iota_f(g_0)h(x)|$$
$$= \sup_{h \in K_f,\ x \in G} |h(g_n^{-1}x) - h(g_0^{-1}x)| = \sup_{\gamma \in G,\ x \in G} |(\gamma f)(g_n^{-1}x) - (\gamma f)(g_0^{-1}x)|$$
$$= \sup_{\gamma \in G,\ x \in G} |f(\gamma^{-1}g_n^{-1}x) - f(\gamma^{-1}g_0^{-1}x)|.$$

If this does not tend to 0 as $g_n$ tends to $g_0$,
then we can find a subnet of $\{g_n\}$, which we write without any change in notation,
and some $\epsilon > 0$ such that this supremum is $\geq \epsilon$ for every $n$. To each such $n$, we
associate some $\gamma_n$ such that $\sup_{x \in G} |f(\gamma_n^{-1}g_n^{-1}x) - f(\gamma_n^{-1}g_0^{-1}x)| \geq \epsilon/2$. By left
almost periodicity we can find a subnet of $\{\gamma_nf\}$ that converges uniformly to some
function, say $H$. This function $H$ has to be left uniformly continuous, and we may
suppose that $\|\gamma_nf - H\|_{\sup} \leq \epsilon/8$ for $n \geq N$. Then $n \geq N$ implies

$$|(\gamma_nf)(g_n^{-1}x) - (\gamma_nf)(g_0^{-1}x)|$$
$$\leq |(\gamma_nf)(g_n^{-1}x) - H(g_n^{-1}x)| + |H(g_n^{-1}x) - H(g_0^{-1}x)| + |H(g_0^{-1}x) - (\gamma_nf)(g_0^{-1}x)|$$
$$\leq \tfrac{\epsilon}{8} + |H(g_n^{-1}x) - H(g_0^{-1}x)| + \tfrac{\epsilon}{8}.$$

The left uniform continuity of $H$ implies that the right side is eventually $\leq \tfrac{3\epsilon}{8}$. This
contradicts the condition $\sup_{x \in G} |f(\gamma_n^{-1}g_n^{-1}x) - f(\gamma_n^{-1}g_0^{-1}x)| \geq \epsilon/2$, and (a) is
proved.

In (b), the action $\Gamma_f \times K_f \to K_f$ is continuous by Problem 42c, and therefore
$\gamma \mapsto \gamma^{-1}h$ is continuous. Evaluation of members of $K_f$ at 1 is continuous, and hence
$F_f(h)$ is continuous on $\Gamma_f$. If $\{g_n\}$ is a net with $g_nf \to h$, then $F_f(h)(\iota_f(g_0)) =
((\iota_f(g_0))^{-1}h)(1) = \lim_n ((\iota_f(g_0))^{-1}g_nf)(1) = \lim_n (g_nf)(g_0) = h(g_0)$.

For (c), we apply (b) with $h = f$. Then $f$ arises from the compact group $\Gamma_f$ via
the construction in Problem 39b. Therefore $f$ is right almost periodic.
44. If $f$ is a given almost periodic function, the function $F$ to use takes an element
$(\gamma_{f'})_{f'}$ of $\Gamma$ to $F_f(\gamma_f)$. Then the equality $F(\iota(x)) = F_f(\iota_f(x)) = f(x)$ shows that $f$
arises from the compact group $\Gamma$.

45. Problem 44 produces an isomorphism of the algebra LAP($G$) of almost
periodic functions on $G$ onto $C(\Gamma)$, and the Stone Representation Theorem (Theorem
4.15) produces an isomorphism of LAP($G$) with $C(S_1)$, where $S_1$ is the Bohr
compactification of $G$. The result then follows after applying Problem 23 in Chapter IV.

46. Finite-dimensional unitary representations of $\Gamma$ give rise to finite-dimensional
unitary representations of $G$, and thus Corollary 6.33 for $\Gamma$ gives the desired result.


47. Any continuous multiplicative character of K yields a continuous multi- 
plicative character of G. Conversely any continuous multiplicative character of G 
is almost periodic by Problem 40 and therefore yields a continuous function on 
K. The multiplicative property of this continuous function on the dense set p(G), 
together with continuity of multiplication on K, implies that the function on K is a 
multiplicative character. 


Chapter VII 


1. If $x_0$ is in $\Omega$, let $\varphi$ be a compactly supported smooth function on $\Omega$ equal to
$(x - x_0)^{\alpha}$ in an open neighborhood $V$ of $x_0$. Then $0 = (P(x, D)\varphi)(x) = (\alpha!)a_\alpha(x)$
on $V$, and hence $a_\alpha(x) = 0$ for $x$ in $V$.

2. Within the Banach space $C(\Omega^{\mathrm{cl}}, \mathbb{R})$, $S$ is the vector subspace of all functions
$u$ with $Lu = 0$ on $\Omega$. It contains the constants and hence is not 0. The restriction
mapping $R : S \to C(\partial\Omega, \mathbb{R})$ is one-one by the maximum principle (Theorem 7.12),
and it has norm 1. Let $V$ be the image of $R$, and let $R^{-1} : V \to S$ be the inverse
of $R : S \to V$. The operator $R^{-1}$ has norm 1 as a consequence of the maximum
principle. If $e_p$ denotes evaluation at the point $p$ of $\Omega$, then $e_p \circ R^{-1}$ is a bounded linear
functional on $V$ of norm 1. The Hahn-Banach Theorem shows that $e_p \circ R^{-1}$ extends
to a linear functional $\ell$ on $C(\partial\Omega, \mathbb{R})$ of norm 1. We know that $\ell(1) = e_p \circ R^{-1}(1) =
e_p(1) = 1$. If $f \geq 0$ is a nonzero element in $C(\partial\Omega, \mathbb{R})$, then $1 - f/\|f\|_{\sup}$ has
norm $\leq 1$. Therefore $|\ell(1 - f/\|f\|_{\sup})| \leq 1$ and $0 \leq \ell(f/\|f\|_{\sup}) \leq 2$. Thus the
linear functional $\ell$ is positive. By the Riesz Representation Theorem, $\ell$ is given by a
measure $\mu_p$. Consequently every $u$ in $S$ has $u(p) = \int_{\partial\Omega} u(x)\,d\mu_p(x)$. Taking $u = 1$
shows that $\mu_p(\partial\Omega) = 1$ for every $p$.


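3. In (a), the line integral $\oint_{|(x,y)|=\epsilon} (P\,dx + Q\,dy)$ is equal to

$$\int_0^{2\pi} \varphi(\epsilon\cos\theta, \epsilon\sin\theta)\,\epsilon^{-2}\big((\epsilon\cos\theta)(-\epsilon\sin\theta) + (\epsilon\sin\theta)(\epsilon\cos\theta)\big)\,d\theta,$$

and the integrand is identically 0. Part (b) is just a computation of partial derivatives.
In (c), we know from Green's Theorem that for any positive numbers $\epsilon < R$,

$$\Big(\oint_{|(x,y)|=R} - \oint_{|(x,y)|=\epsilon}\Big)(P\,dx + Q\,dy) = \iint_{\epsilon \leq |(x,y)| \leq R} \Big(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\Big)\,dx\,dy.$$

With our $P$ and $Q$, for sufficiently large $R$, the line integral $\oint_{|(x,y)|=R}$ is 0 since $P$ and
$Q$ have compact support, and (a) says that the limit of the line integral $\oint_{|(x,y)|=\epsilon}$ is 0 as $\epsilon$
decreases to 0. The function $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = \frac{y\varphi_x - x\varphi_y}{x^2+y^2}$ is integrable near $(0, 0)$, and we thus
conclude from the complete additivity of the integral that $\iint_{\mathbb{R}^2} \big(\frac{y\varphi_x - x\varphi_y}{x^2+y^2}\big)\,dx\,dy = 0$.
In (d), with a new $P$ and $Q$, the line integral $\oint_{|(x,y)|=\epsilon} (P\,dx + Q\,dy)$ is equal to

$$\int_0^{2\pi} \varphi(\epsilon\cos\theta, \epsilon\sin\theta)\,\epsilon^{-2}\big((-\epsilon\sin\theta)(-\epsilon\sin\theta) + (\epsilon\cos\theta)(\epsilon\cos\theta)\big)\,d\theta.$$

This simplifies to $\int_0^{2\pi} \varphi(\epsilon\cos\theta, \epsilon\sin\theta)\,d\theta$, which tends to $2\pi\varphi(0, 0)$ by continuity
of $\varphi$. Part (e) is just a computation of partial derivatives, and part (f) is proved in the
same way as part (c).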


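For (g), we have $\frac{1}{z}\frac{\partial\varphi}{\partial\bar z} = \frac{1}{z}(\varphi_x + i\varphi_y) = \frac{x - iy}{x^2+y^2}(\varphi_x + i\varphi_y) = \frac{x\varphi_x + y\varphi_y}{x^2+y^2} + i\,\frac{x\varphi_y - y\varphi_x}{x^2+y^2}$.
Combining (c) and (f) gives $\iint_{\mathbb{R}^2} \frac{1}{z}\frac{\partial\varphi}{\partial\bar z}\,dx\,dy = -2\pi\varphi(0,0) + i0$, and hence

$$\frac{1}{2\pi}\iint_{\mathbb{R}^2} \frac{1}{z}\frac{\partial\varphi}{\partial\bar z}\,dx\,dy = -\varphi(0,0).$$

For (h), we use (g) and obtain $\big(\frac{\partial T}{\partial\bar z}, \varphi\big) = -\big(T, \frac{\partial\varphi}{\partial\bar z}\big) = -\iint_{\mathbb{R}^2} (2\pi z)^{-1}\frac{\partial\varphi}{\partial\bar z}\,dx\,dy =
\varphi(0, 0) = (\delta, \varphi)$.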

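4. In (a), let $\varphi$ be in $C^{\infty}_{\mathrm{com}}(\mathbb{R}^1)$. Then $(D_xH, \varphi) = -(H, \varphi') = -\int_{-\infty}^{\infty} H(x)\varphi'(x)\,dx
= -\int_0^{\infty} \varphi'(x)\,dx = -\lim_N [\varphi(x)]_0^N = \varphi(0) = (\delta, \varphi)$.

In (b) let $\varphi$ be in $C^{\infty}_{\mathrm{com}}((-1, 1))$. We are to verify that $\int_{-1}^{1} \max\{x, 0\}\varphi'(x)\,dx =
-\int_{-1}^{1} H(x)\varphi(x)\,dx$, i.e., that $\int_0^1 x\varphi'(x)\,dx = -\int_0^1 \varphi(x)\,dx$. This follows from
integration by parts because $\int_0^1 x\varphi'(x)\,dx = [x\varphi(x)]_0^1 - \int_0^1 \varphi(x)\,dx = -\int_0^1 \varphi(x)\,dx$.

The answer to (c) is no. If $g$ were a weak derivative, then the left side of the equality
$\int_{-1}^{1} H(x)\varphi'(x)\,dx = -\int_{-1}^{1} g(x)\varphi(x)\,dx$ would be 0 whenever $\varphi \in C^{\infty}_{\mathrm{com}}((-1, 1))$
vanishes in a neighborhood of 0. Then $g(x)$ would have to be 0 almost everywhere
for $x \neq 0$, and we would necessarily have $0 = \int_0^1 \varphi'(x)\,dx = [\varphi(x)]_0^1 = -\varphi(0)$ for
all $\varphi$ in $C^{\infty}_{\mathrm{com}}((-1, 1))$.

In (d), $(D_x(H \times \delta), \varphi) = -(H \times \delta, D_x\varphi) = -\int_0^{\infty} (D_x\varphi)(x, 0)\,dx$, and this
$= -\lim_N [\varphi(x, 0)]_0^N = \varphi(0, 0) = (\delta, \varphi)$.

In (e), the support of $H$ is $[0, \infty)$ and the singular support is $\{0\}$, while for $H \times \delta$
the support and the singular support are both $[0, \infty) \times \{0\}$.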


5. We apply Lemma 7.8 to R(x) = P(ix). The preliminary step in the proof 
multiplies the given distribution f by something so that f has support near 0. We 
form e~!** f as amember of €'((—2z, 277)) and restrict it toa member of P’(T'). 
Then it has a Fourier series e'%* f ~ >, dpe’**. Put cy = Ren a being the 
member of R® produced by the lemma. Then |c,| < C(1 + |k|?)? for some p, 
and (b) produces a distribution S in €’((—2z7, 27r)") with (S, ek*) = ¢ for all k. 
Define u = e“*S as a member of €’((—2z, 21)"). Let r(x) be a smooth function 
with compact support near 0, and extend 7 to be periodic, i.e., to be inC °(T'). The 
multiple Fourier series of w is then of the form w(x) = >>, yei* with y, decreasing 
faster than any power of |k|. The function g(x) = w (x)e@* is in C® ((—27, 277)%)
but is not necessarily periodic. Applying P(D) to u and having the result act on 9, 
we write 


(P(D)u, p) = (P(D)u, Di, ne"@—O*) = (P(D)u, Dy, y-ne O*). 


Since the are rapidly decreasing and P(D)u is continuous on C™®((—2z, 2n)%), 
we can interchange the summation and the operation of P(D)u. Thus the right side 
of the display is 


Dig Y-e(P(D)u, eV) = YE, yeu, P(—D)(e1OF*)) 
= Dy Vale", PUK + ae HO) = Vy alS, PUK + ae") 
= Ve y-ncnP GK + @)) = Vy Yk Ray Pik Lay= Sv ade: 


Now d; = (e7'** f, e-‘**). The rapid convergence of the series )>, y_xe~'** means 
that (e!* f, w) = Do, y-n(e'** f, e**) = Yo, y_ndy. Therefore (P(D)u, g) = 
yd = (01 f, W) = (01 f, ep) = (f, v). Near 0, the function ¢ is an 
arbitrary smooth function, and thus P(D)u = f near 0. 


6. The coefficient of $x^{\alpha}$ in $(x_1 + \cdots + x_N)^{|\alpha|}$ is the multinomial coefficient
$\binom{|\alpha|}{\alpha_1, \dots, \alpha_N} = \frac{|\alpha|!}{\alpha!}$. This is a positive integer, and hence $\alpha! \leq |\alpha|!$. Fixing $|\alpha| = l$ and
putting $x_1 = \cdots = x_N = 1$, we obtain the formula $N^l = \sum_{|\alpha|=l} \frac{l!}{\alpha!}$, and therefore
$\sum_{|\alpha|=l} (1/\alpha!) = N^l/l!$. The identity with $z$ can be proved by induction on $q$, the base
case being $q = 0$, where the expansion is a geometric series. If the case $q$ is known,
we differentiate both sides and divide by $q + 1$ to obtain the case $q + 1$. Alternatively,
one can derive the identity from the binomial series expansion in Section I.7 of Basic.
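
The two facts about multi-indices used above, $\alpha! \leq |\alpha|!$ and $\sum_{|\alpha|=l} 1/\alpha! = N^l/l!$, are easy to confirm by brute force for small $N$ and $l$. The sketch below is only a sanity check, not a proof; the helper names `multi_indices` and `multi_factorial` are hypothetical, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def multi_indices(N: int, l: int) -> list:
    """All N-tuples of nonnegative integers alpha with |alpha| = l."""
    return [a for a in product(range(l + 1), repeat=N) if sum(a) == l]

def multi_factorial(alpha) -> int:
    """alpha! = alpha_1! * ... * alpha_N!."""
    return prod(factorial(a) for a in alpha)

def sum_of_inverse_factorials(N: int, l: int) -> Fraction:
    """Exact value of the sum of 1/alpha! over all alpha with |alpha| = l."""
    return sum((Fraction(1, multi_factorial(a)) for a in multi_indices(N, l)),
               Fraction(0))
```

Comparing `sum_of_inverse_factorials(N, l)` with `Fraction(N**l, factorial(l))` for a few pairs `(N, l)` reproduces the identity.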


7. Here is the solution apart from some details. The argument uses induction, the
base case being $m = 1$, where the result describes the given system of differential
equations. Assuming that $D_t^{m-1}u$ is of the asserted form, we differentiate the expression
with respect to $t$. In the $2^{m-1}$ terms of the first kind, the derivative acts on some
expression $D_x^{\alpha}u$, giving $D_x^{\alpha}D_tu$. We substitute for $D_tu$ from the given system and
sort out what happens; we get $2^m$ terms involving an $x$ derivative of $u$ and $2^{m-1}$ terms
involving a derivative of $F$. In the $2^{m-1} - 1$ terms of the second kind, the derivative
acts on some iterated partial derivative of $F$ and just raises the order of differentiation.
The total number of terms involving $F$ is then $2^{m-1} + 2^{m-1} - 1 = 2^m - 1$.


8. In (a), just apply $D_x^{\beta}$ to the formula for $D_t^mu$ in the previous problem. The
operator gets applied to each $u$ or $F$ that appears in the formula, and there is no
simplification. Then one evaluates at $(0,0)$. In (b), the asserted finiteness implies
that the multiple power series

$$U(x, t) = \sum_{\beta, m} \frac{D_x^{\beta}D_t^m u(0,0)}{\beta!\,m!}\,x^{\beta}t^m$$

converges when $|t| < r$ and $|x_j| < r$ for all $j$ and that $D_x^{\beta}D_t^mU(0,0) = D_x^{\beta}D_t^mu(0, 0)$
for all $\beta$ and $m$. Then it follows that the sum $U(x, t)$ solves the given Cauchy problem
for these values of $(x, t)$. Since $r$ is arbitrary, the series converges for all $(x, t) \in \mathbb{C}^{N+1}$
and the sum $U(x, t)$ solves the Cauchy problem globally.


9. In (a), we consider a single term of the expansion of $D_t^mu(0,0)$, namely
$T_1 \cdots T_mD_x^{\alpha}u(0,0) = T_1 \cdots T_mD_x^{\alpha}g(0)$. Here each of $T_1, \dots, T_m$ is equal to some
$A_{j_i}$ or to $B$, and $D_x^{\alpha}$ is the product over $i$ of the $D_{j_i}$ for those $T_i$ with $T_i = A_{j_i}$. The
term has $\|T_1 \cdots T_mD_x^{\alpha}g(0)\|_\infty \leq M^m\|D_x^{\alpha}g(0)\|_\infty$, and the boundedness of the series
involving $g(0)$ implies that $(\alpha!)^{-1}\|D_x^{\alpha}g(0)\|_\infty R^{|\alpha|} \leq C$. Let $k$ be the number of
factors of $T_1 \cdots T_m$ equal to $B$. Then $|\alpha| = m - k$, and hence $M^m\|D_x^{\alpha}g(0)\|_\infty \leq
CM^m\alpha!R^{-(m-k)}$. Each $T_i$ equal to $A_{j_i}$ has to be summed over the $N$ values of $j_i$, and
we get a contribution of $N^{m-k}$ from all these sums. Finally the number of such terms
involving $k$ factors $B$ is the number of subsets of $k$ elements in a set of $m$ elements
and is $\binom{m}{k}$, and $\alpha! \leq (m - k)!$ by Problem 6. The desired estimate results.

In (b), we adjust the above estimate by replacing $\|D_x^{\alpha}g(0)\|_\infty$ by $\|D_x^{\alpha+\beta}g(0)\|_\infty$.
Then $C\alpha!R^{-(m-k)}$ gets replaced by $C(\alpha + \beta)!R^{-(m-k)-l}$, where $l = |\beta|$. Since
$(\alpha + \beta)! \leq (m - k + l)!$, the term is $\leq \sum_{k=0}^{m} CM^mN^{m-k}(m - k + l)!\binom{m}{k}R^{-(m-k)-l}$.

In (c), we are to sum the product of the estimate in (b) by $\frac{r^{l+m}}{\beta!\,m!}$, the sum extending
over all $m \geq 0$, all $l \geq 0$, and all $\beta$ with $|\beta| = l$. Thus we are to bound

$$\sum_{m=0}^{\infty}\sum_{l=0}^{\infty}\sum_{|\beta|=l}\sum_{k=0}^{m} \frac{CM^mN^{m-k}(m-k+l)!\binom{m}{k}R^{-(m-k)-l}\,r^{l+m}}{\beta!\,m!}$$
$$= \sum_{m=0}^{\infty}\sum_{l=0}^{\infty}\sum_{k=0}^{m} \frac{CM^mN^{m-k}(m-k+l)!\binom{m}{k}R^{-(m-k)-l}\,N^l\,r^{l+m}}{l!\,m!}$$
$$= C\sum_{m=0}^{\infty}\sum_{k=0}^{m}\Big[\sum_{l=0}^{\infty}\binom{m-k+l}{l}\Big(\frac{Nr}{R}\Big)^{l}\Big]\frac{M^mN^{m-k}R^{-(m-k)}r^m}{k!}$$
$$= C\sum_{m=0}^{\infty}\sum_{k=0}^{m}\Big(1 - \frac{Nr}{R}\Big)^{-(m-k)-1}\frac{M^mN^{m-k}R^{-(m-k)}r^m}{k!},$$

the first and third steps using Problem 6 and the third step requiring the assumption
on $R$ that $Nr/R < 1$. If we assume in fact that $Nr/R \leq 1/2$, then $\big(1 - \frac{Nr}{R}\big)^{-1} \leq 2$,
and the above expression is

$$\leq C\sum_{m=0}^{\infty}\sum_{k=0}^{m} \frac{2^{m-k+1}M^mN^{m-k}R^{-(m-k)}r^m}{k!} \leq 2Ce^{R/(2N)}\sum_{m=0}^{\infty}\Big(\frac{2MrN}{R}\Big)^{m},$$

the second inequality following from the series expansion of the exponential function.
The series on the right is convergent if $2MrN/R < 1$. This proves (c).

In (d), the analog of (a) is to consider a term $T_1 \cdots T_sD_x^{\alpha}D_t^{m-1-s}F$, where each
$T_i$ is some $A_j$ or $B$. Let $k$ be the number of factors $B$, so that $s - k$ factors are
some $A_j$ and $|\alpha| = s - k$. The contributions to $D_x^{\alpha}$ come from the factors $A_j$; regard
the $m - 1 - s$ contributions to $D_t^{m-1-s}$ as coming from factors of the identity $I$. In
this way the two phenomena can be handled at the same time. Ignore the fact that
$I$ commutes with the other matrices; it is easier to treat it as if its occurrences in
different positions were different. The effect is the same as expanding the set of $N$
matrices $A_j$ to include $I$, yielding a set of $N + 1$ matrices. The requirement $M \geq 1$
makes it so that the estimate $\|Iv\|_\infty \leq M\|v\|_\infty$ is valid for the new member of the
set, as well as the old members. The steps for imitating (b) and (c) are then essentially
the same as before except that $m$ is replaced by $m - 1$ and $N$ is sometimes replaced
by $N + 1$.


10. The crux of the matter is to show that if $\{u^{i,j}(x, y)\}$ solves the Cauchy problem
for the first-order system, then $u^{i,j}(x, y) = D_x^iD_y^ju^{0,0}(x, y)$ for $i + j \leq m$ and hence
$u^{0,0}(x, y)$ solves the $m^{\mathrm{th}}$-order equation. The proof proceeds by induction on $j$.
The case $j = 0$ is okay because the first-order system has $D_xu^{i,0} = u^{i+1,0}$ for
$i < m$. Suppose the identity holds for some $j$. Then $D_xu^{i,j+1} = D_yu^{i+1,j}$ from
the system, and this is $= D_yD_xu^{i,j}$ by induction. Hence $D_x(u^{i,j+1} - D_yu^{i,j}) =
0$, and we obtain $u^{i,j+1} - D_yu^{i,j} = c(y)$. Put $x = 0$ and get $u^{i,j+1}(0, y) =
D_y^{j+1}f^{(i)}(y) = D_yD_y^jf^{(i)}(y) = D_yu^{i,j}(0, y)$. Therefore $c(y) = 0$, and $u^{i,j+1} =
D_yu^{i,j} = D_x^iD_y^{j+1}u^{0,0}$. This completes the induction.

11. The second index (j in Problem 10) is replaced by an (N — 1)-tuple a = 
(a1,...,a@y-1). If B ¥ 0, the equation for Du? is Dyu’? = Dy,u', where j 
is the first index for which w; 4 0 and where o is obtained from f by reducing the 
j index by 1. If 6 = 0, the equations are as in Problem 10. The Cauchy data 
are ul? (0, y) = Dy f(y) except when (i, 8) = (m,0), and they are the data of 


Problem 10 when (7, 6) = (m,0). The argument now inducts on 6;,..., By_;, and 
the functions c(y) that appear are of the form c(y1,..., yy—1). The Cauchy data are 
for x = 0, and we get an equation c(y,..., Yyy-1) = 0 in one step in each case. 


12. The equations D,u'/+! = D,u'*!/ involve first partial derivatives in the y 
direction with coefficient 1, and D,u! 0 — y'+1.9 involves an undifferentiated variable 
with coefficient 1. The equation for D,u”"° involves a linear combination of variables 
and first partial derivatives in the y direction of variables, plus the term F,., which is 
an entire holomorphic function of (x, y). So the equations of the first-order system 
are as in Problems 6-9. 


Chapter VIII


1. What needs checking is that the two charts are smoothly compatible. The
set $M_{\kappa_1} \cap M_{\kappa_2}$ is $S^n - \{(0, \dots, 0, \pm 1)\}$, and the image of this under $\kappa_1$ and $\kappa_2$ is
$\mathbb{R}^n - \{(0, \dots, 0)\}$. Put $y_j = x_j/(1 - x_{n+1})$, so that $\kappa_1^{-1}(y_1, \dots, y_n) = (x_1, \dots, x_{n+1})$.
Then

$$\kappa_2 \circ \kappa_1^{-1}(y_1, \dots, y_n) = \big(x_1/(1 + x_{n+1}), \dots, x_n/(1 + x_{n+1})\big)$$
$$= \big(y_1(1 - x_{n+1})/(1 + x_{n+1}), \dots, y_n(1 - x_{n+1})/(1 + x_{n+1})\big).$$

To compute $(1 - x_{n+1})/(1 + x_{n+1})$, we take $|x| = 1$ into account and write $1 =
\sum_{j=1}^{n+1} x_j^2 = x_{n+1}^2 + \sum_{j=1}^{n} y_j^2(1 - x_{n+1})^2$. Then $\sum_{j=1}^{n} y_j^2 = (1 - x_{n+1}^2)/(1 - x_{n+1})^2 =
(1 + x_{n+1})/(1 - x_{n+1})$, and

$$\kappa_2 \circ \kappa_1^{-1}(y_1, \dots, y_n) = \Big(y_1\Big/\sum_{j=1}^{n} y_j^2,\ \dots,\ y_n\Big/\sum_{j=1}^{n} y_j^2\Big).$$

The entries on the right are smooth functions of $y$ since $y \neq 0$, and the two charts are
therefore smoothly compatible.
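
The transition formula $y \mapsto y/|y|^2$ can be confirmed numerically. The sketch below is an illustration under stated assumptions: the chart formulas `kappa1` and `kappa2` are the ones read off from the computation above (division by $1 - x_{n+1}$ and by $1 + x_{n+1}$), and the explicit inverse `kappa1_inverse` is obtained here by solving $\sum y_j^2 = (1 + x_{n+1})/(1 - x_{n+1})$ for $x_{n+1}$; it is not quoted from the text.

```python
import math

def kappa1(x):
    """Chart dividing by 1 - x_{n+1}; x is a point of the unit sphere."""
    return tuple(xj / (1 - x[-1]) for xj in x[:-1])

def kappa2(x):
    """Chart dividing by 1 + x_{n+1}; x is a point of the unit sphere."""
    return tuple(xj / (1 + x[-1]) for xj in x[:-1])

def kappa1_inverse(y):
    """Inverse of kappa1: with s = |y|^2, x_{n+1} = (s-1)/(s+1), x_j = 2*y_j/(s+1)."""
    s = sum(t * t for t in y)
    return tuple(2 * t / (s + 1) for t in y) + ((s - 1) / (s + 1),)

def transition(y):
    """kappa2 composed with the inverse of kappa1, defined for y != 0."""
    return kappa2(kappa1_inverse(y))
```

For a nonzero `y`, each entry of `transition(y)` should agree with the corresponding entry of `y` divided by the squared length of `y`, up to floating-point rounding.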


3. If it is $\sigma$-compact, it is Lindelöf. If it is Lindelöf, countably many charts suffice
to cover $X$. If there is a countable dense set, then we can choose one chart for each
member of the dense set, and these will have to cover $X$. This proves (a). For (b),
each chart has a countable base, and the union of these countable bases, as the chart
varies, is a countable base for $X$.

4. For (a), multiplication is given by polynomial functions, which are smooth.
Inversion, according to Cramer's rule, is given by polynomial functions and division
by the determinant, and inversion is therefore smooth.

For (b), we have

$$\widetilde{A}_gf = \sum_{i,j} A_{ij}\,\frac{\partial(f \circ l_g)}{\partial x_{ij}}(1) = \sum_{i,j,r,s} A_{ij}\,g_{ri}\,\delta_{sj}\,\frac{\partial f}{\partial x_{rs}}(g) = \sum_{r,s} (gA)_{rs}\,\frac{\partial f}{\partial x_{rs}}(g).$$

For (c), the condition for smoothness, by Proposition 8.8, is that all $\widetilde{A}x_{ij}$ be
smooth functions. Part (b) gives $\widetilde{A}x_{ij}(g) = \widetilde{A}_g(x_{ij}) = \sum_{r,s} (gA)_{rs}\delta_{ir}\delta_{js} = (gA)_{ij} =
\sum_k g_{ik}A_{kj}$, and the right side is a smooth function of the entries of $g$. For the
left invariance, let $F = l_h$, and put $g' = F^{-1}(g) = h^{-1}g$. We are to check
that $(dF)_{g'}(\widetilde{A}_{g'})(f) = \widetilde{A}_g(f)$ if $f$ is defined near $g$. The left side is equal to
$\widetilde{A}_{g'}(f \circ l_h) = ((dl_{g'})_1(A))(f \circ l_h) = (dl_h)_{g'}(dl_{g'})_1(A)(f)$, and the right side is
$\widetilde{A}_g(f) = (dl_g)_1(A)(f)$. These two expressions are equal by Proposition 8.7.

Parts (d) and (e) amount to the same thing. For (d), the question is whether
$\widetilde{A}_{g_0\exp tA}f = (dc)_t\big(\frac{d}{dt}\big)(f)$. The right side is $\frac{d}{dt}f(g_0\exp tA)$, and that is why (d) and
(e) amount to the same thing. The left side is $\sum_{r,s} (g_0(\exp tA)A)_{rs}\,\frac{\partial f}{\partial x_{rs}}(g_0\exp tA)$
by (b), and this expression equals $\frac{d}{dt}f(g_0\exp tA)$ by the chain rule and the formula
$\frac{d}{dt}\exp tA = (\exp tA)A$ known from Basic.
5. For (a), fix $l$. Choose, for each $p$ in $L_l$, a compatible chart about $p$ such that the
closure of the domain of the chart is a compact subset of $U_l$. The domains of these
charts form an open cover of $L_l$, and we extract a finite subcover. Taking the union
of such subcovers on $l$, we obtain the atlas $\{\kappa_\alpha\}$.

For (b) and (d), the solution will be a translation into the language of smooth
manifolds of a proof given in introducing Corollary 3.19: In (b), let the domains of
the charts constructed at stage $l$ be $M_{\kappa_1}, \dots, M_{\kappa_r}$. Lemma 3.15b of Basic constructs
an open cover $\{W_1, \dots, W_r\}$ of $L_l$ such that $W_j^{\mathrm{cl}}$ is a compact subset of $M_{\kappa_j}$ for each $j$.
A second application of Lemma 3.15b of Basic produces an open cover $\{V_1, \dots, V_r\}$
of $L_l$ such that $V_j^{\mathrm{cl}}$ is compact and $V_j^{\mathrm{cl}} \subseteq W_j$ for each $j$. Proposition 8.2 constructs
a smooth function $g_j \geq 0$ that is 1 on $V_j^{\mathrm{cl}}$ and is 0 off $W_j$. Then $\sum_{j=1}^{r} g_j$ is $> 0$
on $L_l$ and has compact support in $\bigcup_{j=1}^{r} M_{\kappa_j}$. If we write $\{\eta_\alpha\}$ for the union of the
sets $\{g_1, \dots, g_r\}$ as $l$ varies, then the functions $\varphi_\alpha = \eta_\alpha/\sum_\beta \eta_\beta$ have the required
properties.

For (c), we apply (b) to the smooth manifold $U$. The construction in (b) is arranged
so that about each point is an open neighborhood on which only finitely many $\varphi_\alpha$'s
can be nonzero. As this point varies through $K$, the open neighborhoods cover $K$,
and there is a finite subcover. Therefore only finitely many $\varphi_\alpha$'s have the property
that they are somewhere nonzero on $K$. The sum of this finite subcollection of all
$\varphi_\alpha$'s is then a smooth function with values in $[0, 1]$ that is 1 everywhere on $K$ and has
compact support in $U$.

For (d), we argue as in (b) with two applications of Lemma 3.15b of Basic to
produce an open cover $\{V_1, \dots, V_r\}$ of $K$ such that for each $j$, $V_j^{\mathrm{cl}}$ is a compact
subset of $W_j$, whose closure is a compact subset of $U_j$. Part (c) constructs a smooth
function $g_j \geq 0$ that is 1 on $V_j^{\mathrm{cl}}$ and is 0 off $W_j$. Then $g = \sum_{j=1}^{r} g_j$ is $> 0$ everywhere
on $K$ and has compact support in $\bigcup_{j=1}^{r} U_j$. A second application of (c) produces a
smooth function $h \geq 0$ on $M$ with values in $[0, 1]$ that is 1 on $K$ and is compactly
supported within the set where $g > 0$. Then $g + (1 - h)$ is smooth and everywhere
positive on $M$, and the functions $\varphi_j = g_j/(g + (1 - h))$ have the required properties.


6. In the notation of Proposition 8.6, the matrix $\big[\frac{\partial F_i}{\partial u_j}\big]$ evaluated at
$(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))$,
which is of size $k$-by-$n$, has rank $k$. Choose $k$ linearly independent columns. Possibly
after a change of notation that will not affect the conclusion, we may assume that
they are the first $k$ columns. Call the $n$ functions $y_1 \circ F, \dots, y_k \circ F, x_{k+1}, \dots, x_n$ by
the names $f_1, \dots, f_n$. These are in $C^{\infty}(M_\kappa)$ and have matrix $\big[\frac{\partial(f_i \circ \kappa^{-1})}{\partial u_j}\big]$ of the
block form

$$\begin{pmatrix} \frac{\partial F_i}{\partial u_j} & \frac{\partial F_i}{\partial u_j} \\ 0 & 1 \end{pmatrix}$$

at the point where $(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))$. The upper left corner is
invertible by the condition of rank $k$, and hence the whole matrix is invertible. Then
the result follows from Proposition 8.4.

7. In the notation of Proposition 8.6, the matrix $\big[\frac{\partial F_i}{\partial u_j}\big]$ evaluated at
$(u_1, \dots, u_n) = (x_1(p), \dots, x_n(p))$,
which is of size $k$-by-$n$, has rank $n$. Choose $n$ linearly independent rows. Since
$F_i = (y_i \circ F) \circ \kappa^{-1}$, Proposition 8.4 shows that the corresponding functions $y_i \circ F$
generate a system of local coordinates near $p$. This proves (a).

8. A little care is needed with the definition of measure 0 for a manifold because
the sets of measure 0 that arise are not shown to be Borel sets. However, for points
in the intersection of the domains of two charts $\kappa_1$ and $\kappa_2$, the change-of-variables
theorem shows that the two versions of Lebesgue measure near the two images in
Euclidean space of a point are of the form $dx$ and $|\det(\kappa_1 \circ \kappa_2^{-1})'|\,dx$, and the sets of
measure 0 are the same for these.

The solution of the problem as written is a question of localizing matters so that the
Euclidean version of Sard's Theorem (Theorem 6.35 of Basic) applies. For each point
$p$ in $M$, one can find a chart $\kappa_p$ with $p \in M_{\kappa_p}$ and a chart $\lambda_p$ with $F(p) \in N_{\lambda_p}$ such
that $F(M_{\kappa_p}) \subseteq N_{\lambda_p}$. The Euclidean theorem applies to $\lambda_p \circ F \circ \kappa_p^{-1}$. The separability
implies that countably many of these $M_{\kappa_p}$'s cover $M$. We get measure 0 for the
critical values within each $F(M_{\kappa_p})$, and the countable union of sets of measure 0 has
measure 0.


9. Here we localize and apply Corollary 6.36 of Basic. 


10. The reflexive condition follows with $h = 1$, and the transitive condition follows
by using the composition of two $h$'s. Strictly equivalent is the condition "equivalent"
with $h = 1$.


11. Substitution of the definitions gives 


Bei (x) Bii(X) = Oy | Oh, OPjx OP} OGix = Oye | hy OGix = Bil). 


This proves the first identity, and the second identity is similar. 

12. For (a), if x lies in M,, My; and y lies in F”, then the only way that h can 
have the correct mapping function x +> g,;(x) is to have g,;(x)(y) = by. hdj.x (y). 
Therefore we must have h(@;,,(y)) = Pi Bi (x)(y), and h is unique. 

In (b), if h exists, then it is apparent from the formula for it that it is a diffeomor- 
phism. In this case the function h~! exhibits the relation “equivalent” as symmetric. 


13. For (a), if x lies also in M,, M,:, then we have 
pilb) = 01 (b) = 65.1 bi.x9j 4 ) = gyi(x)(pib)) 
and hence 


hej (b) = bj. 8ki ®)(Pj(D)) = By Bei VB ii (©) (Pi(D)) = Pj eBei ) (Vi (b)) 


= $f, 814%) 8ki (*) (Pi (b)) = $7) 81 &) (Pi lb) = hii). 2 
The sets p~! (M,;  M.) are open and cover B as j and k vary, and the consistency
condition (>) therefore shows that the functions h;,; piece together as a single smooth 
functionh : B > B’. 

For (b), let y be in F”. Put b = ¢;.(y) in the definition of hz;(b), so that 
y= oj) (b) = p;(b), and then we have 


Dix AP;.x(¥) = Phy AD) = be Bh ej) (Pj) = Bej (XV). 


This shows that the functions x +> gx; (x) coincide with the mapping functions of h. 


Chapter IX 


1. The formula is fyjx, = Wx + LY — 51({0}), where jy” is the measure on R 
defined by w¥ (A) = fy, (—A). 


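2. Both sides equal $\int_\Omega \Phi(x_1, \dots, x_n)\,dP$.

3. For (a), we have $\sigma_n^2 = \int_{\mathbb{R}} (t - E)^2\,d\mu_n(t) \geq \int_{|t-E| \geq \delta} (t - E)^2\,d\mu_n(t) \geq
\delta^2P(\{|y_n - E| \geq \delta\})$.

For (b), we calculate

$$|E(\Phi(y_n)) - \Phi(E)| = \Big|\int_{\mathbb{R}} [\Phi(t) - \Phi(E)]\,d\mu_n(t)\Big| \leq \int_{\mathbb{R}} |\Phi(t) - \Phi(E)|\,d\mu_n(t)$$
$$= \int_{|t-E|<\delta} + \int_{|t-E| \geq \delta} \leq \int_{|t-E|<\delta} \epsilon\,d\mu_n(t) + 2MP(\{|y_n - E| \geq \delta\})$$
$$\leq \epsilon + 2M\sigma_n^2\delta^{-2}.$$

In (c), let $\epsilon > 0$ be given, and choose the $\delta$ of continuity for $\Phi$ and $\epsilon$. Then the
calculation in (b) applies. Since $\lim \sigma_n^2 = 0$, the right side is $< 2\epsilon$ for $n$ large enough.
For such $n$, we have $|E(\Phi(y_n)) - \Phi(E)| < 2\epsilon$.

In (d), the argument of (c) depends only on the continuity of $\Phi$ at $E$ and the global
boundedness of $\Phi$. In the situation of Theorem 9.7 with independent identically
distributed random variables $x_k$, we put $s_n = x_1 + \cdots + x_n$ and take $y_n = \frac{1}{n}s_n$. We
saw that if $E(x_k) = E$ and $\mathrm{Var}(x_k) = \sigma^2$, then $E(y_n) = E$ and $\mathrm{Var}(y_n) = \frac{1}{n}\sigma^2$.
Thus (c) applies.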

4. Part (a) is a direct application of the Kolmogorov Extension Theorem. One 
starts with the measure on R! that assigns mass p to {1} and mass | — p to {0}, 
forms the n-fold product to model n independent tosses, and obtains the space for a 
sequence of tosses from the Kolmogorov Theorem. 

In (b), the expectation is p- 1+ (1 — p)-0 = p. The computation for the variance 
isp-1?+(1— p)-0? — p*=p-—p’ = pil— p). 

For (c), the answer is the number of ways of obtaining $k$ heads and $n - k$ tails in
$n$ tosses, namely $\binom{n}{k}$, times the probability of getting a specific sequence of $k$ heads
and $n - k$ tails, which is $p^k (1 - p)^{n-k}$.

In (d), we put $y_n = \frac{1}{n} s_n$. In view of (c), $E(\Phi(y_n))$ is $\sum_{k=0}^{n} \Phi\bigl(\frac{k}{n}\bigr) \binom{n}{k} p^k (1 - p)^{n-k}$,
and (b) shows that $\Phi(E)$ is $\Phi(p)$. The variance of $y_n$ is $\frac{p(1-p)}{n}$, in view of (b); since
this tends to 0, Problem 3c is applicable and establishes the limit formula.

For (e), we go over the solution of Problem 3. The relevant facts for making 
an estimate that is uniform in p are that ® is uniformly continuous and that the 
convergence of the variance to 0 is uniform in p. 
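The uniform convergence in (e) can be observed numerically. The sketch below is an illustration under stated assumptions and not the text's proof: it evaluates the sum from (d), which is usually called a Bernstein polynomial, on a finite grid of values of $p$. The test function `phi`, the grid size, and the helper names `bernstein` and `sup_error` are all hypothetical choices made here.

```python
from math import comb


def bernstein(phi, n, p):
    """E(phi(s_n / n)) for n tosses with heads probability p, using the formula in (c)."""
    return sum(
        phi(k / n) * comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )


def sup_error(phi, n, grid=101):
    """Largest |E(phi(y_n)) - phi(p)| over an evenly spaced grid of p in [0, 1]."""
    points = [i / (grid - 1) for i in range(grid)]
    return max(abs(bernstein(phi, n, p) - phi(p)) for p in points)


def phi(t):
    # Assumed test function: continuous on [0, 1] but not differentiable at 1/2.
    return abs(t - 0.5)


if __name__ == "__main__":
    for n in (20, 200):
        print(n, sup_error(phi, n))
```

The grid maximum only approximates the supremum over $[0, 1]$, so the output suggests uniform convergence rather than proving it. The bound from Problem 3b with $\sigma_n^2 = p(1-p)/n \le 1/(4n)$ is what makes the estimate independent of $p$.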


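6. For the regularity any set in $\mathcal{F}$ is in some $\mathcal{F}_n$. The sets in $\mathcal{F}_n$ are of the form
$\widetilde{E} = E \times \bigl(\prod_{k=n+1}^{\infty} \Omega_k\bigr)$ with $E \subseteq \Omega^{(n)}$ and $\nu(\widetilde{E}) = \nu_n(E)$. Given $\epsilon > 0$, choose
$K$ compact and $U$ open in $\Omega^{(n)}$ with $K \subseteq E \subseteq U$ and $\nu_n(U - K) < \epsilon$. In $\Omega$, $\widetilde{K}$ is
compact, $\widetilde{U}$ is open, $\widetilde{K} \subseteq \widetilde{E} \subseteq \widetilde{U}$, and $\nu(\widetilde{U} - \widetilde{K}) < \epsilon$.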

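7. Let $E = \bigcup_{n=1}^{\infty} E_n$ disjointly in $\mathcal{F}$. Since $\nu$ is nonnegative additive, we have
$\sum_{n=1}^{\infty} \nu(E_n) \le \nu(E)$. For the reverse inequality let $\epsilon > 0$ be given. Choose $K$
compact and $U_n$ open with $K \subseteq E$, $E_n \subseteq U_n$, $\nu(U_n - E_n) < \epsilon/2^n$, and $\nu(E - K) < \epsilon$.
Then $K \subseteq \bigcup_{n=1}^{\infty} U_n$, and the compactness of $K$ forces $K \subseteq \bigcup_{n=1}^{N} U_n$ for some $N$.
Then $\nu(E) \le \nu(K) + \epsilon \le \nu\bigl(\bigcup_{n=1}^{N} U_n\bigr) + \epsilon \le \sum_{n=1}^{N} \nu(U_n) + \epsilon \le \sum_{n=1}^{N} \nu(E_n) + 2\epsilon \le
\sum_{n=1}^{\infty} \nu(E_n) + 2\epsilon$. Since $\epsilon$ is arbitrary, $\nu(E) \le \sum_{n=1}^{\infty} \nu(E_n)$.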

8. The key is that $\Omega$ is a separable metric space. Every open set is therefore the
countable union of basic open sets, which are in the various $\mathcal{F}_n$'s.


10. The collection of subsets of $\Omega$ that are of type $J$ for some countable $J$ is a
$\sigma$-algebra containing $\mathcal{A}'$, and thus it contains $\mathcal{A}$.


11. Continuity cannot be ensured by conditions at only countably many points, 
as we see by altering the value of the function at a point not in a prospective such 
countable set of points. 


12. A nonempty set of $\mathcal{A}$ that is contained in $C$ must be defined in terms of what
happens at countably many points, and no such conditions are possible, just as in the
previous problem. So the set must be empty. Since $\rho_*(C)$ is the supremum of $\rho$ of
all such sets, we obtain $\rho_*(C) = 0$.


13. If $\omega$ is in $C_J$ but not $E$, then the uniform continuity of $\omega$ means that $\omega|_J$ extends
to a member of $C$. In other words, there is a member $\omega'$ of $\Omega$ that is 0 on $J$ such that
$\omega + \omega'$ is in $C$. Since $C \subseteq E$, $\omega + \omega'$ is in $E$. The set $E$ is by assumption of type $J$,
and therefore the sum of any member of $E$ with a member of $\Omega$ that vanishes on $J$ is
again in $E$. Hence $\omega = (\omega + \omega') - \omega'$ is in $E$, contradiction.

14. Problem 13 shows that the infimum of $\rho(E)$ for all $E$ in $\mathcal{A}$ containing $C$ equals
the infimum over all countable $J$ of $\rho(C_J)$. Under the assumption this infimum is 1.
Thus $\rho^*(C) = 1$.
INDEX OF NOTATION 


See also the list of Notation and Terminology on pages xix—xxii. In the list below, 
items are alphabetized according to their key symbols. For letters the order is 
lower case, italic upper case, Roman upper case, script upper case, blackboard 
bold, and Gothic. Next come items whose key symbol is Greek, and then come 
items whose key symbol is a nonletter. The last of these are grouped by type. 


a, 153 

AX 153 
Ag, 271 
(-)°, 266 

Cks 96 
C™%(E,C), C’(E, R), 324 
Cet), 131 
Com(G), 226 
C&, 180 
C,(M), 327 
Cr Con 845 
d, 368 

d™, 250 

dx, 237 

dix, dX, 230 
(dF),, dF,, dF(p), 330 
Dj, 55 

D*%, 55 
P(D), 284 
P(x, D), 185 
Q(D), 55 
D'(M), 352 
D'(U), 180 
E,, 260 
E(x), 378 
E'(M), 352 
E'(U), 114 
Ar, pr, 389 


8f, fg, 222 
gx, xg, 222 
8, Wg, 222 
Bete(-), 338 
g(x), 351 

G", 355 

G,, 362 

G/H, 214 
GL(N,C), 213 
GL(N, F), 338 
GL(N,R), 213, 223, 371 
GLc(V), 241 

G, 310, 357, 361 


Aeon), A. 
aH, 214 
G/H, 214 
H,(x), 32 
H(f, 9), 226 
H?(R2), 100 
H, 236 
HP(RNt), 81 
Fe AD. 
Jn(r), 12 

1, 256 

bpy 227 

L(u), 19 


(M), 366 
L*(v), 20 

L?((0, 1), 0< p <1, 109 
L?(G), 237 
Li(T), 103 
LP), 63 
Loom(M, Lg), Lige(M, Leg), 352 
LAP(G), 272 

M,, M,, 322 
M(L?(S, 12), 160 
OWN), 218 

p'(x, dy), 388 

P, 376 

P(D), 284 

P(x, D), 185 
P(x,t), P(x), 80 
P(x, 27i€), 287 
Pin (x, 271€), 287 
P(t), 16 

P.(@), 15 

Q(D), 55 

Q,, 270 

r, 257 

r(a), 150 

R(A), 150 

Rit, 80 

sgn σ, 242 

sing supp u, 303 
support(-), 192 
STU), Sin 3 307 
Sra gle © U3. Sip pt & Uy 396 
SL(2, R), 236, 268 
SL(N, F), 218 
SO(N), 218, 264 
Sp(N,F), 218 
SU(2), 268 
S'(RY¥), 59 

G,, 242 

(-)", 188 

T®, 102 

Tr, 187, 352 
T(M), 331, 344 


T(M,R), T(M,C), 344 
T*(M), 344, 345 
T*(M,R), 355 
T*(M,R)*, 364 
T,(M), 328 
T;(M), 345 
TrL, 51, 249 
Ux, 2 

au, Ub, 213 
UV, .U-*,-213 
U(N), 218 
xj(q), 323 

dx, 237 

¢~,, 190 

ZN , 292 

Zy, 270 


Greek 

la|, 55 

a!, 55 
D&e55 

6, 206 

A, 2, 276 
A(g), 230 
Ag(t), 232_ 
k:M, > M,, 322 
du(gH), 235 
be, 351 

bx, 379 


o(a), 150 
o1(p, &), 355 
ox(x, &), 364 
gx, 190 

dx, 338 

f*, 221 

Me, 221 
®(f), 257 
Q, 376 


: lnzerny 103 
269 
per 180 

| - lpg» 56 
180 


. Lye 


| lee, 


Other unary operations 
au, 61 

@, 153 

f’, TY, 189 

hie f*, 934 
Binary operations 

(T,¢), 59, 181 

Ky = Ko, XIX, 129 

f *h, 237 

®, 247 

UUV, LJ; Wi, 335 
~, 357 

$1 ® g2, 361 


Other symbols 
1, 214 


INDEX


about a point, 322
absolutely uniform convergence, 6
adele, 271
adjoint, formal, 20
Alaoglu’s Theorem, 120, 146
    preliminary form, 81, 82
algebra
    associative, 121
    Banach, 122, 146
    commuting, 161
    multiplication, 160
almost periodic, 272, 273
amplitude, 356
Approximation Theorem, 255, 267, 274
Artin product formula, 270
Ascoli’s Theorem, 145
associated Legendre equation, 16
associative algebra, 121
Atiyah–Singer Index Theorem, 369
atlas, 322


Baire Category Theorem, 114, 264
Baire set, xxii
Banach algebra, 122, 146
base space, 338
basic separation theorem, 127
Beltrami equation, 94
Bernstein polynomial, 399
Bessel function, 12, 31
Bessel’s equation, 12
Bohr compactification, 272
Borel set, xxii
Borel–Cantelli Lemma, 394
boundary data, 2
boundary of open set, 61
boundary values, 2
boundary-value problem, 2, 277
bounding point, 126
Brouwer Fixed-point Theorem, 145
Brownian motion, 387, 400
bundle
    coordinate vector, 338
    cotangent, 344
    space, 338
    tangent, 344
    vector, 341


C* algebra, 157
Calderón–Zygmund decomposition, 86
Calderón–Zygmund Theorem, 83, 102, 370
Cantor diagonal process, 225
Cantor measure, 400
Cauchy data, 279, 280
Cauchy problem, 284, 320
Cauchy–Kovalevskaya Theorem, 278, 279, 281, 282, 318, 320
Cauchy–Peano Existence Theorem, 145
Cauchy–Riemann equations, 92, 276
Cauchy–Riemann operator, 287, 318, 349
chain rule, 331
character, 250
    group, 250
    multiplicative, 242
    sign, 242
characteristic, 283
chart, 322
    atlas of, 322
    compatible, 322
Chebyshev’s inequality, 393
circled set, 132
closed convex hull, 140
closed subgroup, 217
coin tossing, 376
commuting algebra, 161
compact group, 217
compact operator, 34
compact ring, 270
compactification
    Bohr, 272
    one-point, 124
    Stone–Čech, 125
compatible chart, 322
completely continuous, 34
complex tangent bundle, 344
conditional probability, 382
cone condition, 67
constant coefficients, 279, 282, 292, 300
continuity
    complete, 34
    left uniform, 273
    strong, 256
    uniform, 219
    weak, 267
continuous dual, 114
contragredient representation, 245, 266
convergence
    absolutely uniform, 6
    in probability, 394
    uniform on compact sets, 75
convex set, 125
convolution
    of distributions, 192, 195
    of functions, 186, 237
    of measures, 189
coordinate function, 338
coordinate transformation, 338
coordinate vector bundle, 338
    equivalence, 373
cotangent bundle, 344
countable, xix
critical point, 372
critical value, 372
curve, 333
    integral, 333
cyclic vector, 162


de Rham’s Theorem, 368
degree of homogeneity, xx, 83, 355
degree of representation, 250
density, 380
derivation, 328
derivative
    transverse, 209
    weak, 62, 103, 290
diffeomorphism, 326
differential, 330
differential 1-form, 348
    smooth, 348
differential operator, linear, 19, 353
    transpose of, 20, 353
differentiation of distribution, 188
dimension of smooth manifold, 322
Dirac distribution, 201
Dirac operator, 369
direct limit, 139, 177
direct sum, 247
directed system, 177
Dirichlet problem, 13, 288
discrete group, 213
disjoint union, 335
distribution, 179, 290, 352
    arbitrary, 180
    convolution with, 192, 195
    differentiation of, 188
    Dirac, 201
    equal to a locally integrable function, 183
    equal to a smooth function, 183
    Fourier series of, 209
    Fourier transform of, 60, 202, 203
    function, 380
    given by function, 187
    Heaviside, 318
    kernel, 310, 357, 361
    localization of, 186
    of compact support, 114, 352
    of random variable, 379
    operation on, 187
    periodic, 209
    product with smooth function, 188
    support of, 181
    supported at {0}, 208
    supported on vector subspace, 208
    tempered, 58
Divergence Theorem, 70


eigenfunction, 19
eigenspace, 36
eigenvalue, 19, 36
eigenvector, 36
elliptic
    differential equation, 288
    operator, index, 368
    pseudodifferential operator, 315, 366
equal to a function, distribution, 183
equivalent coordinate vector bundles, 373
equivalent representations, 244, 259
    unitarily, 266
equivalent vector bundles, 373
ergodic measure, 143, 176
essential image, 166
event, 376, 378
exhausting sequence, 64, 113, 350
expectation, 378
expected value, 378
extreme point, 140, 175


F. and M. Riesz Theorem, 102
face, 140
Fatou’s Theorem, 81
finite-dimensional representation, 241
finite-dimensional topological vector space, 111
formal adjoint, 20
formally self adjoint, 20
Fourier inversion formula for compact group, 267
Fourier series
    multiple, 96, 98, 102
    of distribution, 209
    use in local solvability, 292
    use in separation of variables, 5
Fourier transform
    of distribution of compact support, 203
    of tempered distribution, 60, 202
freezing principle, 285
Fréchet limit, 139
Fréchet space, 139
functional calculus, 167
fundamental solution, 206, 290, 300, 302, 318


Gårding’s inequality, 286
Gelfand transform, 153
Gelfand–Mazur Theorem, 151
general linear group, 213, 371
generalized pseudodifferential operator, 356
    transpose of, 356
germ, 327
Green’s formula, 20, 31, 72, 73, 206
Green’s function, 25, 290
group action, 222
group character, 250


Haar covering function, 226
Haar measure, 223, 232
Hahn–Banach Theorem, 126
half space
    Poisson integral formula, 80
    Poisson kernel, 80
Hardy space, 100
harmonic, 2, 69
harmonic measure, 317
harmonic polynomial, 244, 263
heat equation, 2, 6
heat operator, 287
Heaviside distribution, 318
Heaviside function, 318
Hermite polynomial, 32
Hermite’s equation, 32
Hermitian matrix, 36
Hilbert transform, xxii, 83, 92, 101, 211
Hilbert–Schmidt norm, 47
Hilbert–Schmidt operator, 47
Hilbert–Schmidt Theorem, 22, 25, 42, 43
Hirzebruch–Riemann–Roch Theorem, 369
Hodge theory, 368
holomorphic polynomial, 243
homogeneous function, xx, 83, 355
homogeneous partial differential equation, 1, 276
homogeneous space, 214
homomorphism of topological groups, 213
Hopf maximum principle, 297
hyperbolic, 289


idele, 272 
identically distributed, 380 
identification space, 335 
identity component, 264 
identity element, 214 
independent events, 382 
independent random variables, 384 
index of elliptic operator, 368 
indicator function, xxi 
inductive limit topology, 139, 174 
initial data, 2 
initial-value problem, 277 
integrable, locally, 62 
integral curve, 333 
integral operator, 41
    trace of, 98
Interior Mapping Theorem, 173 
internal point, 126 
invariant subspace, 243, 259 
inverse, 149 
invertible, 149
involution, 157
irreducible representation, 245, 259
isomorphism of topological groups, 213
isomorphism of topological vector spaces, 106
isotypic subspace, 261


joint distribution, 380


kernel
    distribution, 310, 357, 361
    of integral operator, 41
    Poisson, 15, 80
Kolmogorov Extension Theorem, 390, 400
Kolmogorov’s inequality, 395
Krein–Milman Theorem, 140


Laplace equation, 2, 13
Laplacian, 2, 206, 276, 287
Law of Large Numbers
    Strong, 394
    Weak, 394
left almost periodic, 272
left coset, 214
left Haar measure, 223
left inverse, 149
left parametrix, 310
left uniform continuity, 273
left-invariant vector field, 371
left-regular representation, 256
Legendre equation
    associated, 16
    ordinary, 16
Legendre polynomial, 16, 31
Leibniz rule, 63
Lewy example, 286, 349
LF topology, 139
Lie group, 349
line segment, 140
linear
    differential operator, 19, 353
    functional, multiplicative, 122, 148, 152
    homogeneous partial differential equation, 276
    operator (see operator)
    partial differential equation, 1, 276
    partial differential operator, 185, 188
    topological space, 106
Liouville, 78
local coordinate system, 323, 372
local neighborhood base, 132
local solvability, 292
localization of distribution, 186
locally compact abelian group, 270
locally compact field, 270
locally compact group, 217
locally compact ring, 271
locally compact topological vector space, 111, 265
locally convex, 128
locally integrable, 62


manifold, 322
    Riemannian, 349
    smooth, 322
mapping function, 373
Marcinkiewicz Interpolation Theorem, 95
Markov–Kakutani Theorem, 143
matrix
    Hermitian, 36
    orthogonal, 218
    rotation, 218
    trace of, 249
    unitary, 37, 218
matrix coefficient, 250
matrix representation, 241
maximal abelian self-adjoint subalgebra, 161
maximal ideal, 152
maximum principle, 297
mean-value property, 72, 73
measurable set of type F, 389
measure
    Cantor, 400
    harmonic, 317
    smooth, 351
measure 0, 372
metric, Riemannian, 349
modular function, 230
modulus of ellipticity, 296
Monotone Class Lemma, 170
multi-index, 55
multiple Fourier series, 96, 98, 102
    of distribution, 209
    use in local solvability, 292
multiplication algebra, 160
multiplication of smooth function and distribution, 188
multiplicative character, 242
multiplicative linear functional, 122, 148, 152
multiplicity, 251


negative, xix
noncharacteristic, 283
norm, Hilbert–Schmidt, 47
normal operator, 52, 165
normed linear space, 107


one-point compactification, 124
operation on distribution, 187
    by −1 in domain, 189
    by convolution, 192, 195
    by linear partial differential operator, 188
    of differentiation, 188
    of Fourier transform, 202
    of multiplication, 188
    of transpose, 187
operator
    compact, 34
    completely continuous, 34
    differential, linear, 353
    Dirac, 369
    elliptic, 288
    elliptic pseudodifferential, 315, 366
    generalized pseudodifferential, 356
    Hilbert–Schmidt, 47
    hyperbolic, 289
    integral, 41
    normal, 52, 165
    orthogonal, 45
    positive semidefinite, 165
    pseudodifferential, 306, 308, 362
    smoothing, 291, 301, 308, 357
    trace of, 51, 98
    trace-class, 49
    transpose of, 185
    unitary, 45, 163, 165
order of differential equation, 276
order of differential operator, 185, 353
orthogonal group, 218
orthogonal matrix, 218
orthogonal operator, 45


p-adic integer, 270
p-adic norm, 269
parametrix, 292, 301, 307, 310, 315
partial differential equation, 276
    elliptic, 288
    homogeneous, 1, 276
    hyperbolic, 289
    linear, 1, 276
    linear homogeneous, 276
    order, 276
    system, 276
partial differential operator
    elliptic, 288
    hyperbolic, 289
    linear, 185, 188
    transpose of, 185, 353
partition of unity, 65, 113, 174, 351, 371
periodic distribution, 209
Peter–Weyl Theorem, 252
Picard–Lindelöf Existence Theorem, 145
Plancherel formula for compact group, 254, 257
Poisson integral formula for half space, 80
Poisson integral formula for unit disc, 15
Poisson kernel for half space, 80
Poisson kernel for unit disc, 15
Poisson’s equation, 291
polynomial
    Bernstein, 399
    harmonic, 244, 263
    holomorphic, 243
    trigonometric, 254
positive, xix
positive definite function, 142, 176
positive semidefinite operator, 165
Principal Axis Theorem, 289
principal symbol, 287, 355, 364
probability, 376, 378
    conditional, 382
probability space, 378
product of topological groups, 213
projection, 338
projective space, 371
properly supported, 313, 357, 361
pseudodifferential operator, 306, 308, 362
    elliptic, 315, 366
    generalized, 356
    transpose of, 308, 356
pseudolocal, 311, 357, 362
pseudonorm, 56
pull back, 221
push forward, 221


quotient of topological vector space, 110
quotient space, 214
    of Lie group, 349


Radon–Nikodym Theorem, 224
random number, 377
random variable, 378
rank, 338
real tangent bundle, 344
regularizing, 61
Rellich’s Lemma, 368
representation
    contragredient, 245, 266
    finite-dimensional, 241
    irreducible, 245, 259
    left-regular, 256
    matrix, 241
    right-regular, 257
    standard, 242
    trivial, 242
    unitary, 245, 256
resolvent, 150
resolvent set, 150
restricted direct product, 178, 271
Riemann–Roch Theorem, 369
Riemannian manifold, 349
Riemannian metric, 349
Riesz Convexity Theorem, 95
Riesz Representation Theorem, 118, 164, 220
Riesz transform, 93
right almost periodic, 272
right group action, 223
right Haar measure, 223
right inverse, 149
right parametrix, 307
right-regular representation, 257
rotation group, 218
rotation matrix, 218


Sard’s Theorem, 372
satisfy cone condition, 67
Schauder–Tychonoff Theorem, 144
Schrödinger’s equation, 32
Schur orthogonality, 249
Schur’s Lemma, 247
Schwartz Kernel Theorem, 209, 361
Schwartz space, 55
Schwarz Reflection Principle, 78
section, 347
    smooth, 347
self adjoint, formally, 20
self-adjoint subalgebra, 161
seminorm, 56
separable, xxi
separating family of seminorms, 57, 107
separation of variables, 1, 3
sign character, 242
singular support, 183, 303, 362
smooth curve, 333
smooth differential 1-form, 348
smooth function, 323, 326
smooth manifold, 322
    dimension, 322
smooth measure, 351
smooth section, 347
smooth structure, 322
smooth vector field, 331, 332, 348
smoothing operator, 291, 301, 308, 357
Sobolev space, 63, 100, 103, 290, 308, 366
Sobolev’s Theorem, 67, 69, 100, 103, 368
space-boundary data, 2
special linear group, 218
Spectral Mapping Theorem, 155
spectral radius, 150
spectral radius formula, 155
Spectral Theorem
    finite-dimensional, 37
    for bounded normal operators, 166
    for bounded self-adjoint operator, 165
    for compact self-adjoint operator, 39
    for unbounded self-adjoint operator, 172
spectrum, 150, 167
standard representation, 242
standard symbol, 307
Stieltjes integral, 379
stochastic process, 389
Stone Representation Theorem, 121, 147, 176
Stone–Weierstrass Theorem, 31, 124, 169, 263
Stone–Čech compactification, 125
strict equivalence, 341
strict inductive limit topology, 139
strong continuity, 256
Strong Law of Large Numbers, 394
Sturm’s Theorem, 5, 21
Sturm–Liouville eigenvalue problem, 20
Sturm–Liouville theory, 5, 19, 172
superposition principle, 1
support function of convex set, 126
support of distribution, 115, 181, 352
    singular, 183, 303, 362
support of function, xxi, 324
supported properly, 313, 357, 361
symbol, 287, 306
    principal, 287, 355, 364
    standard, 307
symplectic group, 218
system of partial differential equations, 276
symplectic group, 218 
system of partial differential equations, 276 


tangent bundle, 332, 344
tangent space, 328
tangent vector, 328
tempered distribution, 58
topological field, 269
topological group, 213
    isomorphism for, 213
topological ring, 270
topological vector space, 106
    defined by seminorms, 107
    finite-dimensional, 111
    isomorphism for, 106
    locally compact, 111, 265
    locally convex, 128
    quotient of, 110
trace of integral operator, 98
trace of linear map, 249
trace of matrix, 249
trace of operator, 51
trace-class operator, 49
transition function, 338
transition matrix, 338
translate, 190
transpose, 187, 352
    of generalized pseudodifferential operator, 356
    of operator, 185
    of ordinary differential operator, 20
    of partial differential operator, 185, 353
    of pseudodifferential operator, 308, 356
transverse derivative, 209
trigonometric polynomial, 254
trivial representation, 242
two-sided parametrix, 315
Tychonoff Product Theorem, 120, 225


ultrametric inequality, 269
uniform continuity, 219
    left, 273
uniform convergence on compact sets, 75
unimodular group, 232
unit disc
    Poisson integral formula, 15
    Poisson kernel, 15
unit sphere, 370
unitarily equivalent, 266
unitary group, 218
unitary matrix, 37, 218
unitary operator, 45, 163, 165
unitary representation, 245, 256
Urysohn Metrization Theorem, 323


variance, 393
vector bundle, 341
    coordinate, 338
    equivalence, 373
vector field, 331, 348
    left-invariant, 371
    smooth, 331, 332, 348
vibrating drum, 18
vibrating string, 17


wave equation, 3, 17
wave operator, 287
weak continuity, 267
weak derivative, 62, 103, 290
Weak Law of Large Numbers, 394
weak topology on normed linear space, 108, 116
weak-star topology on dual of normed linear space, 109, 116
weakly analytic, 150
Weierstrass Approximation Theorem, 399
Weyl integration formula, 269
Wiener Covering Lemma, 87